TRANSLATOR’S PREFACE


This is a translation of the Russian book ПРЕДЕЛЬНЫЕ РАСПРЕДЕЛЕНИЯ
ДЛЯ СУММ НЕЗАВИСИМЫХ СЛУЧАЙНЫХ ВЕЛИЧИН (1949). There are various
points of contact with the treatises by
P. Lévy [76] and by H. Cramér [21], but much of the material in the book
has been hitherto available only in periodical articles, many of which are 
in Russian. The systematic account presented here combines generality 
with simplicity, making some of the most important and difficult parts of 
the theory of probability easily accessible to the reader. Beyond a knowl- 
edge of the calculus on the level of, say, Hardy’s Pure Mathematics, the 
book is formally self-contained. However, a certain amount of mathemati- 
cal maturity, perhaps a touch of single-minded perfectionism, is needed to 
penetrate the depth and appreciate the classic beauty of this definitive 
work. 

It is hoped that the English translation may serve both as a standard 
reference on the subject and as a text or supplementary reading for ad- 
vanced courses in probability. Part of the book may also be used to suit 
other needs. For example, Chapters 1 and 2 may serve as the basis for any 
rigorous course in probability. Readers who are interested in learning the 
fundamental facts about stable laws and the more general infinitely divisible
laws may then go on to §§ 16-18 and §§ 33-34. Those who are interested
in the (weak) law of large numbers, the central limit theorem, and the
analogous limit theorem leading to the Poisson law in their simpler formulations
may find their needs met in § 21. Those who are interested in asymptotic
expansions will need only Chapters 1, 2, 8, and 9; in particular,
§§ 46-47, 49, and 51 are elementary and will be found useful for many
applications.

Now a few words about the translation as compared with the original. 
There are two major textual changes in the English edition. The first 
occurs in § 32, where a mistake found in the original necessitated the dele- 
tion of several paragraphs there and thereafter. The details are explained 
in the second half of Appendix II. The second change occurs in §§ 46-47,
where I have incorporated a substantial improvement from the 1951 
Hungarian translation; see the Translator's Note to Theorem 1 of § 46. 

Some minor corrections, including those of misprints, are made without 
mention; in a few places I have profited by the Hungarian edition which
corrected some of the errors in the Russian edition. In other cases where 
I found fault with the Russian text, I have added a note in addition to, or 
instead of, changing the text. As a result, about fifty such notes are 
appended. These Translator’s Notes are also used to supply references 
omitted by the authors and to add further explanatory remarks. In one 
case, namely in connection with Theorem 1 of § 32, where a rather long
note would be needed, I have put the added material in the first part of 
Appendix II. 

Appendix I was written by J. L. Doob and should be of interest to the 
reader who may be puzzled by the measure-theoretic complications in 
Chapter 1. 

Of the many friends who have lent me assistance of one kind or another, 
the following persons deserve special mention: J. L. Doob, for a variety of 
advice and aid; F. J. Dyson, for consultations on the Russian language; 
G. A. Hunt, for critically reading the manuscript; J. V. Wehausen, for 
helping with the Bibliography; J. Wolfowitz, for encouragement in the 
rather thankless job of translating. Miss Madelyn M. Keady typed the 
manuscript expertly and tirelessly, and my only regret is that we did not 
fully utilize her flawless efforts, since the formula matter was reproduced 
directly from the Russian edition to reduce the cost of printing. The under- 
taking of the translation was part of a project at Cornell University in 
1952-1953, under a contract with the Air Research and Development 
Command, whose support is gratefully acknowledged here. 

K. L. C. 
PREFACE
1 


In the formal construction of a course in the theory of probability, limit 
theorems appear as a kind of superstructure over elementary chapters, 
in which all problems have finite, purely arithmetical character. In reality, 
however, the epistemological value of the theory of probability is revealed 
only by limit theorems. Moreover, without limit theorems it is impossible 
to understand the real content of the primary concept of all our sciences — 
the concept of probability. In fact, all epistemologic value of the theory of 
probability is based on this: that large-scale random phenomena in their 
collective action create strict, nonrandom regularity. The very concept of 
mathematical probability would be fruitless if it did not find its realization 
in the frequency of occurrence of events under large-scale repetition of 
uniform conditions (a realization which is always approximate and not 
wholly reliable, but that becomes, in principle, arbitrarily precise and 
reliable as the number of repetitions increases). 

Therefore the elementary arithmetical calculations of probabilities re- 
lating to games of chance, in the works of Pascal and Fermat, can be 
considered only as the pre-history of the theory of probability, while its 
proper history began with the limit theorems of Bernoulli ([3], 1713) and 
de Moivre ([86], 1730). The fundamental importance of the result of de 
Moivre was completely revealed by Laplace ([72], 1812). To the limit 
theorems of Bernoulli and de Moivre-Laplace it is natural to add three 
more limit theorems of Poisson as the principal achievements of the theory 
of probability before Chebyshev. One of them generalizes the theorem of 
Bernoulli, another the theorem of de Moivre-Laplace, and the third leads 
to the so-called Poisson law of distribution. For a clear understanding of 
what follows it is useful to cite here somewhat modernized formulations of 
the five limit theorems enumerated above. 

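The first four deal with a sequence of independent events

$$\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n, \ldots$$

We shall denote the probabilities of these events by

$$p_n = P(\mathcal{E}_n),$$

and the number of actually occurring events among the first n events

$$\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n$$

by $\mu_n$. In the first two theorems all $p_n$ have the same value $p$ ($p \neq 0$, $p \neq 1$).

1. BERNOULLI’S THEOREM. For every $\varepsilon > 0$

$$P\left(\left|\frac{\mu_n}{n} - p\right| > \varepsilon\right) \to 0$$

as $n \to \infty$.

2. LAPLACE’S THEOREM.

$$P\left(z_1 < \frac{\mu_n - np}{\sqrt{np(1-p)}} < z_2\right) \to \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-z^2/2}\,dz$$

as $n \to \infty$ uniformly with respect to $z_1$ and $z_2$.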
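Theorem 2 can be checked numerically. The following Python sketch compares the exact binomial probability with the normal integral; the function names and the sample values of n, p, $z_1$, $z_2$ are illustrative assumptions and do not come from the text.

```python
import math

def binom_log_pmf(n, k, p):
    """Logarithm of P(mu_n = k) for n independent events of probability p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def exact_probability(n, p, z1, z2):
    """Exact P(z1 < (mu_n - n p) / sqrt(n p (1 - p)) < z2)."""
    scale = math.sqrt(n * p * (1.0 - p))
    return sum(math.exp(binom_log_pmf(n, k, p))
               for k in range(n + 1)
               if z1 < (k - n * p) / scale < z2)

def normal_integral(z1, z2):
    """(1 / sqrt(2 pi)) times the integral of exp(-z^2 / 2) from z1 to z2."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return phi(z2) - phi(z1)

if __name__ == "__main__":
    # Illustrative values only: p = 0.3 and the interval (-1, 1).
    for n in (100, 1000, 10000):
        print(n, exact_probability(n, 0.3, -1.0, 1.0), normal_integral(-1.0, 1.0))
```

The gap between the two columns should shrink roughly like $1/\sqrt{n}$, since the binomial distribution lives on a lattice of spacing $1/\sqrt{np(1-p)}$ after normalization.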
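In the next two theorems $p_n$ may depend on n, but subject to the condition
that the series

$$\sum_{n} p_n (1 - p_n)$$

diverges. We set

$$p_1 + p_2 + \cdots + p_n = A_n,$$
$$p_1(1 - p_1) + p_2(1 - p_2) + \cdots + p_n(1 - p_n) = B_n^2.$$

3. LAW OF LARGE NUMBERS IN POISSON’S FORM. For every $\varepsilon > 0$

$$P\left(\left|\frac{\mu_n}{n} - \frac{A_n}{n}\right| > \varepsilon\right) \to 0$$

as $n \to \infty$.

4. CENTRAL LIMIT THEOREM IN POISSON’S FORM.

$$P\left(z_1 < \frac{\mu_n - A_n}{B_n} < z_2\right) \to \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-z^2/2}\,dz$$

as $n \to \infty$ uniformly with respect to $z_1$ and $z_2$.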


The fifth of the theorems we are interested in deals with a scheme of
events

$$\begin{array}{l} \mathcal{E}_{11}, \\ \mathcal{E}_{21}, \mathcal{E}_{22}, \\ \mathcal{E}_{31}, \mathcal{E}_{32}, \mathcal{E}_{33}, \\ \ldots\ldots\ldots \end{array}$$

in which the events in the same row are mutually independent and have
the same probability $p_n$, depending only on the index of the row. We denote
by $\mu_n$ the number of events in the nth row which actually occur.

5. POISSON’S LIMIT THEOREM FOR RARE EVENTS. If

$$n p_n \to a$$

as $n \to \infty$, then

$$P(\mu_n = m) \to \frac{a^m}{m!}\, e^{-a}.$$


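By introducing the random variables

$$\varepsilon_n = \begin{cases} 1 & \text{if } \mathcal{E}_n \text{ occurs}, \\ 0 & \text{if } \mathcal{E}_n \text{ does not occur}, \end{cases}$$

we can write

$$\mu_n = \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_n$$

in Theorems 1, 2, 3, 4, and

$$\mu_n = \varepsilon_{n1} + \varepsilon_{n2} + \cdots + \varepsilon_{nn}$$

in Theorem 5.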

This makes it possible to include all five limit theorems enumerated 
above as very special cases of limit theorems concerning sums of independent 
random variables. 
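As a small numerical illustration of Theorem 5, the sketch below compares the exact distribution of the number of occurring events in the nth row, taking $p_n = a/n$, with the Poisson limit. The function names and the value a = 2 are assumptions chosen for the example; the code assumes a < n.

```python
import math

def binom_pmf(n, m, p):
    """P(mu_n = m) when each of the n events in the row has probability p (0 < p < 1)."""
    log_value = (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
                 + m * math.log(p) + (n - m) * math.log1p(-p))
    return math.exp(log_value)

def poisson_pmf(a, m):
    """The limit value a^m e^{-a} / m!."""
    return math.exp(m * math.log(a) - a - math.lgamma(m + 1))

def max_discrepancy(n, a):
    """Largest |P(mu_n = m) - a^m e^{-a} / m!| over m = 0..n, with p_n = a / n."""
    p = a / n
    return max(abs(binom_pmf(n, m, p) - poisson_pmf(a, m)) for m in range(n + 1))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_discrepancy(n, 2.0))
```

The printed discrepancy should decrease as n grows, in agreement with the limit relation.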

The idea that the normal probability distribution

$$P(\zeta < z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-z^2/2}\,dz,$$


which turned out to be the limit in Theorems 2 and 4, must also appear in 
a more general problem about the limit distribution of the sum of a large 
number of individually negligible independent summands is one of the 
essential ideas of the theory of errors developed by Gauss. However, in
the matter of rigorous proofs Gauss did not reach results equivalent to 
the theorem of de Moivre-Laplace. 

Effective methods for the rigorous proof of limit theorems concerning 
sums of arbitrarily distributed independent variables were created in the 
second half of the nineteenth century by Chebyshev. His classical work 
opened a new period of development of the entire theory of probability. 
All of Chebyshev’s efforts were devoted to the solution of two problems.
Consider a sequence of independent random variables

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

having finite mathematical expectations

$$a_n = M\xi_n$$

and finite variances

$$b_n = \sigma_n^2 = M(\xi_n - a_n)^2.$$

Put

$$\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n,$$
$$A_n = a_1 + a_2 + \cdots + a_n,$$
$$B_n^2 = b_1 + b_2 + \cdots + b_n.$$

FIRST PROBLEM. What additional conditions ensure the law of large
numbers: for every $\varepsilon > 0$

$$P\left(\left|\frac{\zeta_n}{n} - \frac{A_n}{n}\right| > \varepsilon\right) \to 0$$

as $n \to \infty$?

SECOND PROBLEM. What additional conditions ensure the central limit
theorem:

$$P\left(\frac{\zeta_n - A_n}{B_n} < z\right) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-z^2/2}\,dz$$

as $n \to \infty$ uniformly with respect to z?
For application to the first problem the method developed by Chebyshev
in his work ([16], 1867) requires only the condition

$$B_n = o(n).$$


This is usually called Markov’s condition, since Markov first pointed out 
clearly the degree of generality of Chebyshev’s reasoning. The law of large 
numbers under Markov’s condition not only includes Theorems 1 and 3
of Bernoulli and Poisson, but in the great majority of applications more or 
less completely settles the question for sums of independent summands. 
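The reason Markov’s condition suffices is Chebyshev’s inequality, which gives $P(|\zeta_n - A_n|/n > \varepsilon) \le B_n^2/(n^2 \varepsilon^2)$; the right side tends to zero exactly when $B_n = o(n)$. The Python sketch below evaluates this bound and, for the Bernoulli scheme, compares it with the exact probability. The function names and the sample values p = 0.3, ε = 0.05 are assumptions made for the illustration.

```python
import math

def chebyshev_bound(variances, eps):
    """Upper bound B_n^2 / (n^2 eps^2) for P(|zeta_n - A_n| / n > eps)."""
    n = len(variances)
    return sum(variances) / (n * n * eps * eps)

def exact_bernoulli_deviation(n, p, eps):
    """Exact P(|mu_n / n - p| > eps) in the Bernoulli scheme."""
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) > eps:
            log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                       + k * math.log(p) + (n - k) * math.log(1.0 - p))
            total += math.exp(log_pmf)
    return total

if __name__ == "__main__":
    p, eps = 0.3, 0.05
    for n in (100, 1000, 10000):
        bound = chebyshev_bound([p * (1.0 - p)] * n, eps)
        print(n, exact_bernoulli_deviation(n, p, eps), bound)
```

The bound is crude compared with the exact probability, but it requires nothing beyond the variances, which is what makes the argument so general.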

The solution of the second problem was considerably harder. For it 
Chebyshev created the method of moments, which is one of his most 
important achievements in mathematics. The solution given by Chebyshev 
in his paper ([17], 1887) is based on a lemma which was proved only later 
by Markov ([82], 1898). Soon afterwards the second problem of Chebyshev
was solved by Lyapunov under considerably more general conditions by 
another method ([79], 1900; [80], 1901). Subsequently Markov succeeded 
in proving that the method of moments is capable of giving as general a 
result as that obtained by Lyapunov. However, the method of Lyapunov 
turned out in its further development to be much simpler and more power- 
ful in application to the entire circle of problems concerning limit theorems 
for sums of independent variables. This is the method of characteristic 
functions, which is the principal method employed in our book. 

The solution given by Lyapunov satisfies all the needs of the great 
majority of applications. Nevertheless, we shall give instead of Lyapunov's 
theorem the solution of Chebyshev's second problem in the form of 
Theorem 4 of § 21. The condition used there, namely Lindeberg's condition
that for every $\varepsilon > 0$


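$$\lim_{n\to\infty} \frac{1}{B_n^2} \sum_{k=1}^{n} \int_{|x - a_k| > \varepsilon B_n} (x - a_k)^2\, dF_k(x) = 0$$

($F_k$ denoting the distribution function of $\xi_k$),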





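is somewhat broader than Lyapunov’s condition. In its logical structure it
is even simpler than Lyapunov’s condition

$$\lim_{n\to\infty} \frac{C_n}{B_n^{2+\delta}} = 0 \quad \text{for some } \delta > 0,$$

where

$$C_n = \sum_{k=1}^{n} M|\xi_k - a_k|^{2+\delta}.$$
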
Let us turn to the simpler special case of a sequence

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

of independent identically distributed variables. In this case, the central
limit theorem is applicable without any additional conditions other than
the mere existence of the mathematical expectations

$$a_n = a$$

and variances

$$b_n = b$$

(see Theorem 4 of § 35). However, it is erroneous to conclude, even for the
case of identically distributed summands, that there exist no really
interesting limit theorems in which the limit laws are different from the normal
law.†

In order to show by an example that such an opinion is only deep-rooted 
prejudice, we now consider the simple, classical scheme of random motion 
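on a straight line, corresponding to the game of “heads or tails”:

$$\eta(0) = 0,$$

$$\eta(t+1) = \begin{cases} \eta(t) + 1 & \text{with probability } \tfrac{1}{2}, \\ \eta(t) - 1 & \text{with probability } \tfrac{1}{2}, \end{cases}$$

independently of what

$$\eta(1), \eta(2), \ldots, \eta(t)$$

are.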

It is well known that this scheme is the simplest of a long series of 
random motion schemes which have great importance in the most varied 
applications of the theory of probability, very remote from games of chance. 

We number in an increasing sequence all the values of t for which

$$\eta(t) = 0.$$

We obtain (with probability one) an infinite sequence

$$0 = \tau_0 < \tau_1 < \tau_2 < \cdots < \tau_n < \cdots$$

The differences

$$\xi_n = \tau_n - \tau_{n-1}$$

form a sequence of independent and identically distributed random variables.
Each of the variables $\xi_n$ takes only positive even values with probabilities

$$p_m = P(\xi_n = 2m) = \frac{2m\,(2m-2)!}{2^{2m}\,(m!)^2}.$$

Since

$$p_m \sim \frac{1}{2\sqrt{\pi}\, m^{3/2}}$$

asymptotically as $m \to \infty$, the mathematical expectation

$$M\xi_n = 2 \sum_{m=1}^{\infty} m\, p_m$$

is infinite. Nevertheless, the sums

$$\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n,$$
† Translator's note. The word “law” is taken to be synonymous with
“distribution” in such contexts. In “the normal (or Poisson) law” often the
corresponding type (see § 10) is meant.
with suitable normalization, are subject in the limit to a completely
determined law of distribution:

$$\lim_{n\to\infty} P\left(\frac{\zeta_n}{n^2} < z\right) = \begin{cases} 0 & \text{for } z \le 0, \\ \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_0^z e^{-1/(2z)}\, z^{-3/2}\,dz & \text{for } z > 0 \end{cases}$$

(see in this connection Theorem 5 of § 35 and the end of § 34).
The reader should turn his attention to $n^2$ in the denominator of the
expression

$$\frac{\zeta_n}{n^2}.$$






In the case of the sum $\zeta_n$ of identically distributed independent variables
with finite variances the denominator of the expression

$$\frac{\zeta_n - A_n}{B_n}$$

in the central limit theorem would have the order $\sqrt{n}$. Comparison of
these two special cases compels us to pose this general problem: Under
what conditions on identically distributed independent variables

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

can a limit relation

$$P\left(\frac{\zeta_n - A_n}{B_n} < z\right) \to V(z)$$

hold, where $A_n$ and $B_n$ are constants, and what kind of limit laws V(z)
can appear?
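The return-time example that motivates this problem can be explored numerically. The Python sketch below, with function names that are assumptions of this illustration, computes the probabilities $p_m$ of a first return to zero at time 2m, compares them with the asymptotic value $1/(2\sqrt{\pi}\, m^{3/2})$, and shows that the truncated sums of $2m\,p_m$ keep growing, in line with the infinite mathematical expectation.

```python
import math

def return_probabilities(m_max):
    """p_m = P(xi_n = 2m) for m = 1..m_max.

    Uses p_1 = 1/2 and the ratio p_{m+1} / p_m = (2m - 1) / (2m + 2),
    which follows from p_m = 2m (2m - 2)! / (2^{2m} (m!)^2).
    """
    probs = [0.5]
    for m in range(1, m_max):
        probs.append(probs[-1] * (2 * m - 1) / (2 * m + 2))
    return probs

def asymptotic_value(m):
    """The approximation 1 / (2 sqrt(pi) m^{3/2})."""
    return 1.0 / (2.0 * math.sqrt(math.pi) * m ** 1.5)

def truncated_mean(probs, m_max):
    """Partial sum of 2 m p_m over m = 1..m_max."""
    return sum(2 * m * probs[m - 1] for m in range(1, m_max + 1))

if __name__ == "__main__":
    probs = return_probabilities(100000)
    print("ratio to asymptotic value:", probs[-1] / asymptotic_value(100000))
    for m_max in (100, 10000, 100000):
        print(m_max, truncated_mean(probs, m_max))
```

The truncated sums grow like a multiple of $\sqrt{m}$, so no finite expectation exists.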

The question about the class of limit laws which can possibly appear 
in the situation indicated above was completely settled by A. Ya. Khint- 
chine. It turned out that up to linear transformations this class consists 
only of the normal law, occupying a special position; the unitary law 


$$\varepsilon(x) = \begin{cases} 0 & \text{for } x \le 0, \\ 1 & \text{for } x > 0; \end{cases}$$


and a family of distribution laws with infinite variances, depending on 
two parameters (α and β in the notations of Ch. 7). All these distribution
laws, called “stable” because of circumstances which are explained in § 33, 
deserve the most serious attention. It is probable that the scope of applied 
problems in which they play an essential role will become in due course 
rather wide. 
Poisson’s limit theorem for rare events should long ago have suggested
that even in the case of finite variances there can exist interesting and useful 
limit theorems concerning sums of independent variables and leading to 
distribution laws essentially different from the normal. To obtain them in 
a systematic way, it is natural to turn to the scheme of a double sequence 
of random variables 


$$(\xi_{n1}, \xi_{n2}, \ldots, \xi_{n m_n}), \quad n = 1, 2, 3, \ldots,$$

where the random variables of the same row are independent, and to
consider the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{n m_n}.$$

The simplest and most important case is that in which all variables
$\xi_{nk}$ in the same row are identically distributed. The problem consists as
before in classifying the conditions under which a limit relation

$$P(\zeta_n < z) \to V(z)$$

can hold and what kind of laws V(z) can appear. Here, of course, it is
natural to consider only the case where

$$m_n \to \infty.$$

It is curious that if all random variables $\xi_{nk}$ can take only two values
x' and x'' independent of the indices n and k, then the only possible limit
laws (up to a linear transformation) will be the normal law, the improper
law ε(x) and the family of Poisson laws with one parameter a (see
Kozulyaev [70]).
The class of possible limit laws in such a formulation of the problem
coincides with the class of infinitely divisible laws, to which Chapter 3 is 
devoted. Naturally, it contains all the stable laws and Poisson’s law. The 
corresponding limit theorems are proved in Chapter 4. Here we only 
mention that for a better understanding of their intuitive meaning it may 
be useful for the reader to become acquainted with a special case treated 
in the book of A. Ya. Khintchine [53] under the name of “generalized limit 
theorem of Poisson." This elementary limit theorem leads only to those 
infinitely divisible laws with characteristic functions of the form 


$$f(t) = \exp\left\{ c \int (e^{itu} - 1)\, dF(u) \right\}$$

(see § 16). Distribution laws of this type have as many finite moments as
does their generating distribution F(u). It is possible to indicate many 
physical and technical problems leading to them. 

Among infinitely divisible laws, belonging neither to the class of stable 
laws nor to that of laws of the special type just mentioned, we mention 
also a family of distributions well known in mathematical statistics. They 
are given by the incomplete gamma functions 


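$$V(z) = \begin{cases} 0 & \text{for } z \le 0, \\ \dfrac{1}{\Gamma(\alpha)} \displaystyle\int_0^z z^{\alpha-1} e^{-z}\,dz & \text{for } z > 0, \end{cases}$$

depending on the parameter α > 0 (see Example 4, § 17). To this family
belongs in particular (for α = 1), the exponential distribution

$$V(z) = \begin{cases} 0 & \text{for } z \le 0, \\ 1 - e^{-z} & \text{for } z > 0. \end{cases}$$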
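The infinite divisibility of this family can be seen from its characteristic function. The passage does not state it, but it is a standard fact that the law with parameter α has characteristic function $(1 - it)^{-\alpha}$, so that it is the nth power of the characteristic function with parameter α/n. The Python sketch below checks this identity numerically; the function names and sample values are assumptions of the illustration.

```python
def gamma_cf(t, alpha):
    """Characteristic function (1 - i t)^(-alpha) of the gamma law with parameter alpha."""
    return (1 - 1j * t) ** (-alpha)

def divisibility_gap(t, alpha, n):
    """|f_alpha(t) - (f_{alpha/n}(t))^n|; zero up to rounding, since the law
    with parameter alpha is the n-fold composition of the law with parameter alpha / n."""
    return abs(gamma_cf(t, alpha) - gamma_cf(t, alpha / n) ** n)

if __name__ == "__main__":
    for n in (2, 7, 50):
        print(n, divisibility_gap(0.5, 2.5, n))
```

For α = 1 the function reduces to $1/(1 - it)$, the characteristic function of the exponential distribution.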

If we renounce the assumption that all the random variables in the 
same row have the same law of distribution, then the problem of deter- 
mining all possible laws V(z), in its exact formulation above, becomes 
meaningless. The limit law V(z) can be absolutely arbitrary. This is indeed 
natural, since now the requirement $m_n \to \infty$ is illusory. It does not prevent,
for example, that in each row one single summand $\xi_{nk}$ plays the dominating
role. Meaningful results, conformable to the original lofty conception of
the classical limit theorems in the theory of probability, are obtained only
under the following additional requirement: for every ε > 0 there should
exist constants $a_{nk}$ such that

$$\sup_{1 \le k \le m_n} P\{|\xi_{nk} - a_{nk}| > \varepsilon B_n\} \to 0.$$

This requirement of the “asymptotic negligibility” of the variation of each
individual summand in comparison with the chosen scale $B_n$ for the sum
$\zeta_n$ is quite natural. In § 20 it is introduced in the particular case $B_n = 1$
under the name “asymptotic constancy.”

A. Ya. Khintchine proved that with this restriction the only possible 
limit laws in the case of arbitrarily distributed terms are the same infinitely 
divisible laws as in the identically distributed case (§ 24). Therefore it is 
quite natural that the infinitely divisible laws turn out to be the central 
concept throughout the first part of this book. It seems to us that the 
theory of these laws and the general limit theorems connected with them 
will receive in time diverse applications. One of the present authors intends
soon to publish in a separate article a survey of those applications of the 
stable and infinitely divisible laws which have already been found. 
In all practical applications limit theorems are used essentially as
approximate formulas for finite though sufficiently large values of n. In 
order that such an application be completely justified, the formulas should 
be provided with estimates of the remainder terms. If the remainder terms 
decrease slowly as n → ∞, then it becomes necessary to introduce, for finite
n, corrections to the limit distribution V(z). The most powerful and general 
method of finding such corrections is to consider the various asymptotic 
expansions for the distribution 


$$V_n(z) = P(\zeta_n < z).$$


For the classical central limit theorem such an asymptotic expansion with
terms of order

$$\frac{1}{\sqrt{n}},\ \frac{1}{(\sqrt{n})^2},\ \frac{1}{(\sqrt{n})^3},\ \ldots$$


was indicated by Chebyshev himself without a sound basis. The recent 
development of his idea is traced in Chapter 8. 

In Chapter 9 are discussed various other directions for the improvement 
of the limit theorems. Up to now there have been great achievements only 
in the improvement of the classical limit theorem concerning convergence 
to the normal distribution. The improvement of new limit theorems is 
given only in the direction of “local” theorems concerning convergence to 
the stable laws (§ 49).

The above gives a sufficiently precise outline of the scope of questions 
dealt with in this book. Within these limits we strive for exhaustive com- 
pleteness wherever the results achieved at the present time seem to have 
definitive value. In those problems where now there are only results which 
will probably be strengthened in the near future, or where it is likely that 
novel formulations will combine great generality with great simplicity, 
we have confined ourselves to considering the simplest special cases illus- 
trative of the nature of the problems. For this reason we consider, for 
example, the question of improving the limit theorems and local limit 
theorems only for the identically distributed case, leaving further in- 
formation to periodical articles. 

Generalizations to several dimensions and to sums of dependent vari- 
ables are outside the scope of our book. A complete exposition of the re-
sults obtained in these directions by Markov, S. N. Bernstein, and their 
followers would require another volume of the same size as this, if we treated 
in full the limit theorems connected with Markov chains. 

The striving, peculiar to the latest researches of our chosen field, for 
complete generality and logical perfection, and for the discovery wherever 
possible of conditions which are both necessary and sufficient, seems com- 
pletely justified in view of its central position in the entire theory of prob- 
ability. It is natural to sharpen our methods of investigation to the fullest 
extent on the testing ground of these classical problems and their immediate 
natural generalizations. Our book therefore has a theoretical nature: the 
topics selected are investigated systematically, whether or not all their 
developments have applied value at the moment. 

We believe, however, that a great many cases of limiting behavior 
which seem to be introduced here only for the sake of exhausting all logical 
possibilities will also receive diverse applications in time. Some indications 
in this direction have already been given above. At any rate, there is no 
doubt that the arsenal of those limit theorems which should be included 
in future practical handbooks must be considerably expanded in com- 
parison with classical standards. Of course, it is necessary to make some 
choice. For example, “normal” convergence to the non-normal stable laws 
(see § 35) undoubtedly must already be considered in any comprehensive 
text in, say, the field of statistical physics. But the consideration of “non- 
normal" convergence with an irregular normalization even in the case of 
the normal limit law (see loc. cit.) would only unnecessarily overburden 
such a practical text. 

We mention also that we confine ourselves everywhere to the estimates 
of the order of the remainder terms, or in better cases to their asymptotic 
behavior, instead of giving estimates in the form of precise inequalities. 
In order to pick out this or that limit theorem as having immediate practical
value, it would be necessary to fill this gap. But to trace such estimates
systematically throughout our exposition would be onerous, and
we preferred not to pause at all for them in this book.
It remains for us to mention that this book is based on the lectures
given by us in Moscow and Lwow Universities, but in the final editing 
very great help was rendered us by U. V. Prohorov, who is responsible for 
a very great number of essential improvements in the formulations and 
proofs. We express our gratitude to him for this work. 


Part I INTRODUCTION


CHAPTER 1 


PROBABILITY DISTRIBUTIONS. 
RANDOM VARIABLES AND MATHEMATICAL EXPECTATIONS 





§ 1. PRELIMINARY REMARKS


In the basic chapters of this book we study exclusively the probability 
of events the occurrence or nonoccurrence of which is uniquely determined 
by the values of a finite number of real random variables 


$$\xi_1, \xi_2, \ldots, \xi_n.$$

In this kind of question it is possible to get along without any more general
basic concept than the probability distribution of the random point

$$\xi = (\xi_1, \xi_2, \ldots, \xi_n)$$

in the n-dimensional Euclidean space $R^n$. Such a distribution is given in a
most logical way by means of the function

$$P_\xi(A) = P_{\xi_1, \xi_2, \ldots, \xi_n}(A)$$

of the set $A \subset R^n$, which is understood to be the probability of the event

$$\xi \in A.$$
We impose the following restrictions:

(a) The domain of definition $\mathfrak{F}_\xi$ of the function $P_\xi(A)$ is a Borel field of
subsets of $R^n$ (see the definition in § 2), containing as an element the
space $R^n$ itself.

(b) The field $\mathfrak{F}_\xi$ contains all open sets of the space $R^n$.

(c) The function $P_\xi(A)$ is non-negative and countably additive.

(d) For any set A of $\mathfrak{F}_\xi$ the value $P_\xi(A)$ is equal to the infimum of the
values $P_\xi(G)$ for all open sets G containing A.

(e) $P_\xi(R^n) = 1$.

The first four of these requirements express the usual properties of measures
in $R^n$. The fifth requirement indicates that probability measures are
always taken to be “normalized.”

This concept of n-dimensional probability distribution is chosen as the 
axiomatic basis of all further considerations in, for example, H. Cramér's
book [22]. However, even in the narrow frame of an exclusive interest in
finite-dimensional distributions this approach is not without defect. One
can be convinced of this by reading §§ 14.2 and 14.5 of Cramér's book.
His axiom 3 compels us to suppose that the basic object, whose properties
must be fixed by means of axioms, is a certain unnamed collection of all
random variables. In § 14.5 it is even “proved” that any Borel-measurable
function of random variables is itself a random variable. But it remains
vague whether we mean here those random variables to which axioms 1, 
2, and 3 refer or random variables in some new sense. It would be possible, 
of course, to avoid these obscurities at the expense of even stronger restric- 
tions. We could always start off with an initial probability distribution 
for a “basic” set of random variables 


$$\xi_1, \xi_2, \ldots, \xi_n$$

and strictly distinguish from these the “generated” random variables

$$\eta = f(\xi_1, \xi_2, \ldots, \xi_n),$$


for which the distribution laws are calculated from the basic distribution 
P(A). 

A broader and more natural perspective and also the possibility of 
treating all random variables considered as having equal rights are opened 
only by a more general approach, developed, for example, in A. N. Kol- 
mogorov’s book [65]. We shall follow this system of exposition, in which 
all random variables considered are functions 


$$\xi = \xi(u)$$


of some abstract argument u. §§ 2—4 contain all we need of the general 
theory of probability distributions on arbitrary sets. The exposition in 
these sections is somewhat improved over that in [65]. 

A basic change in comparison with [65] is the use of the new concept of
“perfect” measure, which is introduced in § 3. This apparently abstract
concept is introduced to bring the general theory into closer correspondence
with the usual notion of distributions in concrete spaces ($R^1$, $R^n$, etc.).
The reader can find in the book [69] the proofs of all the propositions
formulated in § 3. To this book the reader is referred for all the demonstrations
concerning the theory of measure and Lebesgue integrals.

Together with the distribution

$$P_{\xi_1, \xi_2, \ldots, \xi_n}(A) = P\{(\xi_1, \xi_2, \ldots, \xi_n) \in A\}$$

itself we often use in the study of a system of n random variables
$\xi_1, \xi_2, \ldots, \xi_n$ the so-called distribution function

$$F_{\xi_1, \xi_2, \ldots, \xi_n}(a_1, a_2, \ldots, a_n) = P(\xi_1 < a_1, \xi_2 < a_2, \ldots, \xi_n < a_n).$$

This is nothing but the value of $P_{\xi_1, \xi_2, \ldots, \xi_n}(A)$ for that part of the space $R^n$
singled out by the inequalities

$$x_k < a_k, \quad k = 1, 2, \ldots, n.$$

The use of such an n-dimensional distribution function, on the whole, 
may be admitted to be an anachronism, preserved from the time when 


the notion of a set function was not sufficiently cultivated. 


We define the mathematical expectation of a random variable by the 
formula 
$$M\xi = \int_U \xi\, dP,$$

where on the right side stands the Lebesgue integral over the set U of all
“elementary events.” From this we easily deduce other forms of writing
the mathematical expectation:

$$M\xi = \int_{R^1} x\, P_\xi(dx) = \int x\, dF_\xi(x)$$

and in case $\xi = f(\eta)$, where η is an auxiliary random variable,

$$M\xi = \int_{R^1} f(y)\, P_\eta(dy) = \int f(y)\, dF_\eta(y).$$
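For a random variable with finitely many values the integrals above reduce to finite sums, and the agreement of the two ways of computing the mathematical expectation of ξ = f(η) can be checked directly. The following Python sketch is only an illustration; the law of η, the function f, and all names in the code are assumptions chosen for the example.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(law, f):
    """Distribution of xi = f(eta) when eta has the finite law {value: probability}."""
    image = defaultdict(Fraction)
    for y, prob in law.items():
        image[f(y)] += prob
    return dict(image)

def expectation(law, f=lambda x: x):
    """Sum of f(value) * probability, the finite analogue of the integral of f."""
    return sum(f(value) * prob for value, prob in law.items())

# eta is uniform on {-2, -1, 0, 1, 2} and xi = eta^2 (hypothetical example).
eta_law = {y: Fraction(1, 5) for y in range(-2, 3)}
xi_law = pushforward(eta_law, lambda y: y * y)

if __name__ == "__main__":
    print("via the law of xi: ", expectation(xi_law))
    print("via the law of eta:", expectation(eta_law, lambda y: y * y))
```

Both computations give the same value, the first by summing x against the law of ξ and the second by summing f(y) against the law of η.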


The use of the general Lebesgue integral (over an arbitrary set and 
with an arbitrary measure) allows us to dispense with the elementary 
theory of the Stieltjes integral. To us this seems a great advantage, since 
a complete exposition of this last theory with all the details which are 
necessary for a correct basis for applications to probability, is very cumbersome
(and, so far as we know, has never been given). We mean the following
circumstances:

(a) In order to establish the composition formula for the distribution 
of sums of independent terms, the definition of the Stieltjes integral 
current in textbooks of analysis is insufficient (see in this connection §§ 8
and 10 and supplement I to book [32]). (b) The theory of integrals with 
infinite limits, which are essential in the theory of probability, requires 
additional stipulations in the elementary approach. (c) The equation 


$$M\xi = \int x\, dF_\xi(x) = \int f(y)\, dF_\eta(y)$$

with $\xi = f(\eta)$, is proved in the elementary theory in a very cumbersome way.
And in order that the existence of the second integral imply the existence
of the first integral, it is necessary in the definition of the integral with 
infinite limits to impose restrictions which are essentially foreign to the 
elementary theory of such integrals. (d) The elementary theory of multiple 
Stieltjes integrals is little developed, and the questions (a), (b), and (c) are 
not yet fully treated in the periodical literature. By turning to the abstract
theory of the Lebesgue integral the necessity for all these tiresome and
narrowly special investigations is eliminated. Since the distribution P(A) is
uniquely determined by the distribution function F(a) (see § 6), by
tradition we preserve the notation

$$\int_{R^1} f(x)\, P(dx) = \int f(x)\, dF(x).$$

In § 6 it is shown that under certain conditions this integral becomes the
limit of Stieltjes sums. For the calculation and estimation of integrals
these sums are often useful but with our approach they are not involved
in the construction of the theory of integration. 

In § 7 is proved the theorem about the composition of laws of distribu- 
tion when independent summands are added. The proof, based on the 
theorem of Fubini, is applicable without change to the sum of independent 
vectors, and even to more general cases. 

§ 8 contains all the necessary information about integrals

$$\int_A f(x)\, \Phi(dx) \quad \text{and} \quad \int_a^b f(x)\, d\Phi(x),$$

in which $\Phi(A)$ can take negative values and $\Phi(x)$ is not monotone.


§ 2. MEASURES 


In this section we remind the reader of the basic definitions in the 
theory of measure. In accordance with our later needs, we confine ourselves 
here to finite measures. 


DEFINITION 1. The system of sets $\mathfrak{F}$ is called a field of sets, if (a) there
exists $U \in \mathfrak{F}$ such that $A \in \mathfrak{F}$ implies $A \subset U$; (b) if $A \in \mathfrak{F}$ and $B \in \mathfrak{F}$,
then $A \,\triangle\, B \in \mathfrak{F}$ and $A \cap B \in \mathfrak{F}$.†

It is easy to see that U is uniquely determined by the field of sets.
It is the “unit” $U_{\mathfrak{F}}$ of the field. To every field belongs the empty set

$$N = U \,\triangle\, U.$$

From (a) and (b) it follows further that the union and intersection of
any finite number of sets of the field belong to the field.


DEFINITION 2. The field $\mathfrak{F}$ is called a Borel field if the union of any
countable system of sets of $\mathfrak{F}$ belongs to $\mathfrak{F}$.


It is easy to prove that the intersection of a countable number of sets
of a Borel field $\mathfrak{F}$ also belongs to $\mathfrak{F}$.

To avoid misunderstanding we remark that, generally speaking, the
set $\{u\}$, consisting of a single element $u \in U_{\mathfrak{F}}$,
may not belong to $\mathfrak{F}$.

Finally, the possibility of considering only Borel fields in many questions
is based on the fact that any system $\mathfrak{S}$ of subsets of a set U is contained
in a unique minimal Borel field with unit U.

This minimal field


$$\mathfrak{F} = B_U(\mathfrak{S})$$


is called the Borel closure of the system $\mathfrak{S}$ with respect to U.


† Translator's note. The symbol $A \,\triangle\, B$ denotes the set of points in one but not in
both of the sets A, B.

DEFINITION 3. A measure is a real non-negative set function $\mu(A)$,
for which
(μ1) The domain of definition $\mathfrak{M}_\mu$ is a Borel field of sets.


(μ2) For any finite or countable number of mutually disjoint sets
$A_n \in \mathfrak{M}_\mu$, we have


$$\mu\Bigl(\bigcup_n A_n\Bigr) = \sum_n \mu(A_n).$$


(μ3) $\mu(A) = 0$ and $B \subset A$ imply $B \in \mathfrak{M}_\mu$.
The set

$$U_\mu = U_{\mathfrak{M}_\mu}$$

is called the carrier of the measure $\mu$.


In the theory of probability, where the most diverse distributions of 
random variables, random vectors, and random functions are obtained 
from the basic probability distribution P(A) on the set U of elementary 
events, the following definition has fundamental importance. 

DEFINITION 4. Suppose that the single-valued function

$$u' = \xi(u)$$

maps $U_\mu$ into some set $U'$. The measure on $U'$ generated by the mapping
$\xi$ from the measure $\mu$ is the set function


$$\mu'(A') = \mu^{\xi}(A'),$$

for which
(1) The domain of definition $\mathfrak{M}_{\mu'}$ consists of all sets
$A' \subset U'$
for which the complete inverse image*


$$\xi^{-1}(A') \in \mathfrak{M}_\mu;$$


(2)
$$\mu'(A') = \mu(\xi^{-1}(A')).$$

It is easy to prove that the function $\mu'$ is indeed a measure, in the sense
of Definition 3, for which

$$U_{\mu'} = U'.$$
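As an illustration of Definition 4, the following Python sketch computes the generated measure in the simplest case of a finite carrier, where every subset is measurable and a measure is determined by its point masses. The names `pushforward`, `mu`, and `xi` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(mu, xi):
    """Return the point masses of the measure generated by xi from mu.

    mu maps each point u of a finite carrier to its mass mu({u});
    the result maps each image point u' to mu(xi^{-1}({u'})).
    """
    mu_prime = defaultdict(Fraction)
    for u, mass in mu.items():
        mu_prime[xi(u)] += mass
    return dict(mu_prime)

# Uniform measure on {1, ..., 6}, mapped by the parity function.
mu = {u: Fraction(1, 6) for u in range(1, 7)}
mu_prime = pushforward(mu, lambda u: u % 2)
```

Here the generated measure assigns mass 1/2 to each of the two image points, and its total mass equals that of the original measure.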


If the set $U'$ is a metric (or topological) space, the mappings $\xi$ which
are measurable with respect to $\mu$ (or μ-measurable), i.e., the mappings
for which all inverse images


$$\xi^{-1}(G)$$


of open sets $G \subset U'$ belong to $\mathfrak{M}_\mu$, deserve special attention. Thus, if
$\xi$ is a measurable mapping the system of sets $\mathfrak{M}_{\mu'}$ contains all open sets
of the space $U'$, hence by the property (μ1) of a measure it contains also
all Borel sets of the space. This remark is the starting point of the
considerations of the next section.


* The complete inverse image $\xi^{-1}(A')$ of a set $A' \subset U'$, into which no element
of $U_\mu$ is mapped, is taken to be the empty set N.


§ 3. PERFECT MEASURES


In the study of measures in metric (or topological) spaces it is usual
to confine ourselves to the consideration of those measures which besides
(μ1), (μ2), and (μ3) of our abstract measure possess also the following
properties:


(μ4) All open sets of the space $U_\mu$ belong to $\mathfrak{M}_\mu$.
(μ5) The measure $\mu(A)$ of any set $A \in \mathfrak{M}_\mu$ is equal to the infimum
of the measures $\mu(G)$ of open sets G containing A.


Both these requirements are meaningless in application to measures
in an abstract set $U_\mu$. At the end of the preceding section we saw, however,
that the measure $\mu'$ generated by a μ-measurable mapping of an abstract
set $U_\mu$ into a metric space $U'$ always possesses the property (μ4). To
achieve complete harmony between the abstract theory of measure and
the theory of measure in metric spaces it would be desirable that measurable
mappings of the carrier $U_\mu$ of an abstract measure into metric spaces
yield measures possessing not only the property (μ4) but also the property
(μ5). We now see that by means of some restriction on the class of abstract
measures admitted to consideration this wish is realized, at least in
application to mappings in the most common and important metric spaces. The
required restriction is achieved by the following definition.


The measure $\mu$ is called perfect, if the measure

$$\mu^{\xi}$$

possesses the property (μ5) whenever $\xi$ is a measurable mapping of the
set $U_\mu$ into the real line $R^1$.


The reasonableness of this definition is confirmed by the following two
theorems:


THEOREM 1. If $U_\mu$ is a complete metric space with a countable basis
and the measure $\mu$ possesses the property (μ4), then it will be perfect if
and only if it possesses the property (μ5).


THEOREM 2. Any mapping $\xi$ of the carrier $U_\mu$ of a perfect measure into
an arbitrary set $U'$ generates (in accordance with Definition 4) a perfect
measure $\mu'$.

In case the mapping $\xi$ into the metric space $U'$ is measurable, the
measure $\mu'$ necessarily possesses the property (μ4); hence by comparing
Theorems 1 and 2 we deduce


COROLLARY. If the measure $\mu$ is perfect, then any measurable mapping $\xi$
of the set $U_\mu$ into a complete metric space with a countable basis generates
in this space a measure $\mu'$ with the properties (μ4) and (μ5).


Theorem 1 shows that the class of perfect measures is broad enough for
applications. The corollary of Theorems 1 and 2 shows that by confining
ourselves to the general theory of complete measures and their measurable
mappings, we shall obtain in the most natural and simple concrete cases
only measures possessing the usual properties (μ4) and (μ5).


§ 4. THE LEBESGUE INTEGRAL


μ-measurable mappings $\xi$ of the set $U_\mu$ into the real line are usually
called measurable (with respect to $\mu$) functions. We shall consider as known
the basic properties of the Lebesgue integral


$$\int_A \xi(u) \, \mu(du) \qquad (1)$$


of such functions over sets A belonging to the system $\mathfrak{M}_\mu$ (which are
usually called μ-measurable sets). The reader can find all necessary
information about the Lebesgue integral in the book [69]. Here we remind him
only of some facts, not well known but important in the theory of
probability. We note that in what follows finite Lebesgue integrals are always
meant. When we say that the integral (1) exists, we mean that both it
and the integral


$$\int_A |\xi(u)| \, \mu(du)$$


have definite finite values.


THEOREM 1. Suppose that the mapping $e$ of the set $U_\mu$ into $U'$ generates
in $U'$ the measure $\mu'$; that the set $A' \subset U'$ belongs to $\mathfrak{M}_{\mu'}$; that the real
function $\xi'(u')$ is defined on $U'$ and measurable with respect to $\mu'$; and that


$$\xi(u) = \xi'(e(u)) \quad \text{and} \quad A = e^{-1}(A').$$


Then

$$\int_A \xi(u) \, \mu(du) = \int_{A'} \xi'(u') \, \mu'(du').$$


Moreover, both integrals exist or neither exists.

THEOREM 2. If $\xi(u) \ge 0$ on the set $U_\mu$ and the integral


$$\int_{U_\mu} \xi(u) \, \mu(du)$$

exists, then the set function


$$\lambda(A) = \int_A \xi(u) \, \mu(du)$$


is a measure with


$$U_\lambda = U_\mu, \qquad \mathfrak{M}_\lambda \supseteq \mathfrak{M}_\mu.$$

Any μ-measurable function $\varphi(u)$ is also λ-measurable and


$$\int_A \varphi(u) \, \lambda(du) = \int_A \varphi(u) \, \xi(u) \, \mu(du).$$


Moreover both integrals exist or neither exists.


THEOREM 3. If under the conditions of Theorem 2 $\xi(u) > 0$ everywhere
on $U_\mu$, then for every $A \in \mathfrak{M}_\lambda = \mathfrak{M}_\mu$


$$\mu(A) = \int_A \frac{1}{\xi(u)} \, \lambda(du).$$


The classes of λ-measurable and μ-measurable functions $\varphi(u)$ coincide and


$$\int_A \varphi(u) \, \mu(du) = \int_A \frac{\varphi(u)}{\xi(u)} \, \lambda(du).$$


Moreover both integrals exist or neither exists.
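The content of Theorems 2 and 3 can be checked by hand on a finite carrier, where integrals reduce to finite sums. The sketch below is only an illustration under that assumption; the helper `integral` and the particular masses and density are hypothetical.

```python
from fractions import Fraction

def integral(phi, measure, A):
    """Integral of phi over the finite set A with respect to point masses."""
    return sum(phi(u) * measure[u] for u in A)

# mu: point masses on U = {0, 1, 2, 3}; xi: a strictly positive density on U.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
xi = lambda u: Fraction(u + 1)

# lambda(A) = integral of xi over A with respect to mu, stored as point masses.
lam = {u: xi(u) * mu[u] for u in mu}

phi = lambda u: Fraction(u * u)
A = [1, 2, 3]
lhs = integral(phi, lam, A)                       # integral of phi d(lambda)
rhs = integral(lambda u: phi(u) * xi(u), mu, A)   # integral of phi * xi d(mu)

# Theorem 3: mu is recovered from lambda by means of the density 1 / xi.
mu_back = {u: lam[u] / xi(u) for u in lam}
```

Both sides of the formula of Theorem 2 agree, and dividing by the positive density returns the original measure, as Theorem 3 asserts.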


§ 5. MATHEMATICAL FOUNDATIONS OF THE THEORY OF PROBABILITY


After the preparation made in the preceding sections we can state very 
briefly the assumptions which are necessary for the development of the 
theory of probability. 

The probability $P(A)$ is a perfect measure satisfying the additional
condition of “normalization”:


$$P(U_P) = 1.$$


This brief formulation takes the place of all the “axiomatics” of the
theory of probability.
The elements of the fundamental set


$$U = U_P$$


are called in the theory of probability elementary events, and a set A from
the system


$$\mathfrak{S} = \mathfrak{M}_P$$

is a random event. From this it follows that “elementary events” are not
“random events” (even in the case, which is by no means necessary, that
the set $\{u\}$ consisting of a single elementary event u belongs to the
system $\mathfrak{S}$).

By comparison with [65] and other expositions of the theory of
probability constructed on the same principle, the approach established here
contains two new restrictions: (a) The probability must satisfy the
requirement (μ3) included in the definition of a measure in § 2. (b) The probability
must be perfect in the sense of § 3. Both these restrictions are useful for
further developments of the theory and at the same time do not narrow
the domain of really interesting applications.

The conditional probability of a random event A relative to the
occurrence of the random event B with $P(B) > 0$ is defined by the equation

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

It is easy to prove (see [65]) that for a fixed B the conditional probability
possesses all the properties of the ordinary “unconditional” probability.

A P-measurable function $\xi(u)$ of the elementary event u is called a
random variable.

By the very definition of a measurable function (see § 4) a random
variable takes only real values. Only such real random variables are meant
in the general considerations of this chapter. But the carrying over of the
definition and the simplest properties of the mathematical expectation
to complex random variables, i.e., to measurable mappings of the
fundamental set U into the complex plane, presents no difficulties.

Mathematical expectations and conditional mathematical expectations
are defined by the formulas


$$M\xi = \int_U \xi(u) \, P(du), \qquad M(\xi \mid B) = \int_U \xi(u) \, P(du \mid B).$$


In all that follows we shall denote random variables by Greek letters $\xi$,
$\eta$, $\zeta$, omitting the argument u. Instead of $P(du)$ we shall write $dP$. In
accordance with these conventions the definition of mathematical
expectation is written


$$M\xi = \int_U \xi \, dP.$$


Calculations with conditional mathematical expectations are most
conveniently performed on the basis of the trivial equation


$$M(\xi \mid B) = \frac{1}{P(B)} \int_B \xi \, dP.$$
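For a finite set of elementary events the last equation can be evaluated directly. The following Python sketch is an illustration only; the function names and the example of a fair die are assumptions made for the purpose of the example.

```python
from fractions import Fraction

def expectation(xi, P, event=None):
    """Integral of xi over `event` (default: the whole set U) with respect to P."""
    points = P if event is None else event
    return sum(xi(u) * P[u] for u in points)

def conditional_expectation(xi, P, B):
    """M(xi | B) = (1 / P(B)) * integral of xi over B, assuming P(B) > 0."""
    prob_B = sum(P[u] for u in B)
    if prob_B == 0:
        raise ValueError("P(B) must be positive")
    return expectation(xi, P, B) / prob_B

# A fair die: U = {1, ..., 6}, xi(u) = u, B = "the outcome is even".
P = {u: Fraction(1, 6) for u in range(1, 7)}
xi = lambda u: u
B = {2, 4, 6}
m = expectation(xi, P)
m_given_B = conditional_expectation(xi, P, B)
```

In this example the unconditional expectation is 7/2, while the conditional expectation relative to the event B equals 4.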


We remark also that in the expressions


$$P(\dots) \quad \text{or} \quad \int_{(\dots)} \xi \, dP,$$

$(\dots)$ will always denote the set of those elementary events satisfying the
relations in parentheses or under the integral sign. For example,


$$\int_{|\xi| < a} \xi \, dP$$


denotes the integral of $\xi(u)$ over the set of those u for which $|\xi(u)| < a$.


§ 6. PROBABILITY DISTRIBUTIONS IN $R^1$ AND IN $R^n$
Any n random variables
$$\xi_1, \xi_2, \dots, \xi_n$$
can be considered as an n-dimensional random vector
$$\xi = (\xi_1, \xi_2, \dots, \xi_n).$$


Obviously, n-dimensional vectors map the fundamental set U into the
n-dimensional coordinate space $R^n$. The measure


$$P_\xi = P^{\xi},$$


arising from this mapping in accordance with Definition 4 of § 2, is called
the probability distribution of the random vector $\xi$, or the joint distribution
of all the random variables $\xi_1, \xi_2, \dots, \xi_n$. In an expanded form the notation
is


$$P_\xi(A) = P_{\xi_1 \xi_2 \dots \xi_n}(A).$$


It is easy to prove that every random vector maps U into $R^n$ in a
measurable way, i.e., the domain of definition


$$\mathfrak{S}_\xi = \mathfrak{M}_{P_\xi}$$


of the probability distribution $P_\xi$ contains all open (and consequently also
all Borel) sets of the space $R^n$. In accordance with § 3 it follows from this
that the measure $P_\xi$ possesses also the property (μ5). Conversely, any
measure $\mu$ with


$$U_\mu = R^n,$$

possessing the properties (μ4) and (μ5) and satisfying the condition
$$\mu(R^n) = 1, \qquad (*)$$
can serve as the probability distribution of an n-dimensional random
vector. For the proof of this assertion it is sufficient to take as the set U
the space $R^n$ itself, to put
$$P = \mu,$$
and to take as the mapping $\xi$ the identical mapping
$$\xi(u) = u.$$


Measures in $R^n$ possessing the properties (μ4) and (μ5) and satisfying
the condition of normalization (*) are called n-dimensional distributions
and are denoted preferably by the letter P.

Of greatest practical importance are two special classes of n-dimensional 
distributions: continuous distributions and discrete distributions. 

A distribution P in $R^n$ is called continuous if it is representable in the
form


$$P(A) = \int_A p(x) \, dx, \qquad (1)$$


where the sign $dx$ denotes integration with respect to the ordinary
n-dimensional Lebesgue measure, and $p(x)$ is a Lebesgue summable function
of points (vectors) $x \in R^n$. As is well known, the function $p(x)$ is
determined uniquely by the distribution P up to its values on a set of measure
zero. It is called the density of the corresponding distribution. If $P = P_\xi$ is
the probability distribution of the random vector $\xi$, then


$$p(x) = p_\xi(x)$$


is the probability density corresponding to the vector $\xi$. In an expanded
form the probability density of the vector $\xi = (\xi_1, \xi_2, \dots, \xi_n)$ is written as


$$p_\xi(x) = p_{\xi_1 \xi_2 \dots \xi_n}(x_1, x_2, \dots, x_n)$$


and is called the probability density of the system of random variables
$\xi_1, \xi_2, \dots, \xi_n$.

The probability distribution P is called discrete if it is representable in
the form

$$P(A) = \sum_{x^{(k)} \in A} p_k, \qquad (2)$$

where the $x^{(k)}$ are points of $R^n$, finite or countable in number, and the $p_k$
are non-negative numbers (their sum, of course, must equal one).

If the probability distribution $P_\xi$ of a random vector $\xi$ is discrete then
all the points x for which


$$P(\xi = x) > 0$$


are called its possible values. Obviously, only these possible values play an
essential role in (2).

We now turn to distributions on the real line $R^1$, i.e., to probability
distributions $P_\xi$ of single random variables $\xi$. All that was said about
distributions in $R^n$ for an arbitrary n is applicable to this particular case
n = 1. As we have already mentioned in § 1, in the one-dimensional case
it is often expedient to single out the value of the function $P_\xi(A)$ for the
sets A consisting of all the points of the real line $R^1$ lying to the left of
some fixed point a, i.e., the (improper) intervals


$$(-\infty; a).$$


Considering these values as a function of a, we obtain the distribution
function


$$F_\xi(a) = P_\xi(-\infty; a) = P(\xi < a)$$


of the random variable $\xi$.

The proof of the following two propositions can be found in [69]:

I.* The one-dimensional distribution $P(A)$ is uniquely determined
by its corresponding distribution function


$$F(a) = P(-\infty; a).$$


II. In order that the function $F(a)$ may be a distribution function, it is
necessary and sufficient that it be nondecreasing for all a, continuous to
the left, and have the limiting values

$$F(-\infty) = 0, \qquad F(+\infty) = 1.$$


Obviously, if P is a continuous distribution the corresponding
distribution function is represented in the form

$$F(a) = \int_{-\infty}^{a} p(x) \, dx, \qquad (3)$$

where the integral must, generally speaking, be understood in the sense
of Lebesgue.
In the case of a discrete distribution, given by (2), we have
$$F(a) = \sum_{x^{(k)} < a} p_k. \qquad (4)$$
Obviously,
$$p_k = F(x^{(k)} + 0) - F(x^{(k)}).$$
For an arbitrary distribution $P_\xi(A)$ the analogous formula holds:
$$P(\xi = x) = F_\xi(x + 0) - F_\xi(x).$$


Since the sum of probabilities of mutually exclusive events does not exceed
one, the number of points at which


$$F(x + 0) - F(x) > 0,$$


i.e., the points of discontinuity of the function $F(x)$, is at most countable.
If $F(x + 0) - F(x) = 1$ for some x, then the distribution is called
improper. Every distribution which is not improper is called proper.
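The convention $F(a) = P(\xi < a)$, which makes the distribution function continuous to the left, and the formula for the jumps can be illustrated on a discrete distribution. The sketch below assumes a finite number of possible values; the names `distribution_function` and `jump` are hypothetical.

```python
from fractions import Fraction

def distribution_function(points):
    """Return F(a) = P(xi < a) for a discrete distribution.

    `points` maps each possible value x_k to its probability p_k.
    The strict inequality makes F continuous to the left.
    """
    def F(a):
        return sum(p for x, p in points.items() if x < a)
    return F

def jump(points, x):
    """F(x + 0) - F(x) for a distribution with finitely many possible values."""
    F = distribution_function(points)
    larger = [y for y in points if y > x]
    # Any point strictly between x and the next possible value attains F(x + 0).
    right = (x + min(larger)) / 2 if larger else x + 1
    return F(right) - F(x)

points = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
F = distribution_function(points)
```

With these values $F(1) = 1/4$, because the mass at the point 1 itself is not counted, while the jump at 1 equals $p_k = 1/2$ and the jump at a point that is not a possible value is zero.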





* The proof of I is based on the property (μ5) of the measure $P(A)$, i.e., on the
assumption that $P(A)$ is a perfect measure.

Probability distributions are used in calculating mathematical
expectations. Thus, if the random variable $\xi$ is represented in the form of a Borel
measurable function


$$\xi = f(\eta_1, \eta_2, \dots, \eta_n)$$


of other random variables $\eta_1, \eta_2, \dots, \eta_n$, then by Theorem 1 of § 4


$$M\xi = \int_U \xi \, dP = \int_{R^n} f(y) \, P_{\eta_1 \eta_2 \dots \eta_n}(dy). \qquad (5)$$

If the joint distribution of the variables $\eta_1, \eta_2, \dots, \eta_n$ is continuous, then
by Theorem 2 of § 4, Eq. (5) may be written as*
$$M\xi = \int_{R^n} f(y) \, p_{\eta_1 \eta_2 \dots \eta_n}(y) \, dy. \qquad (6)$$


If $P(A)$ is a one-dimensional distribution with the corresponding
distribution function $F(x)$,


$$\int_a^b f(x) \, P(dx) = \int_a^b f(x) \, dF(x)$$


will denote the integral


$$\int_{[a;\,b)} f(x) \, P(dx)$$


over the half-open interval
$$[a; b) = \{a \le x < b\}.$$


If $a = -\infty$, this is the interval $(-\infty < x < b)$. The integral

$$\int_a^b f(x) \, dF(x)$$


as function of the upper limit is always continuous to the left and the usual
relation holds:


$$\int_a^b f(x) \, dF(x) + \int_b^c f(x) \, dF(x) = \int_a^c f(x) \, dF(x).$$


If the function $f(x)$ is continuous in the closed interval $[a; b]$, then the
integral


$$I = \int_a^b f(x) \, dF(x)$$

can be calculated by means of the Stieltjes sums
$$S = \sum_{k=1}^{n} f(x_k) \, [F(a_k) - F(a_{k-1})],$$
where
$$a = a_0 < a_1 < \dots < a_n = b, \qquad a_{k-1} \le x_k \le a_k.$$
As $\max_k (a_k - a_{k-1}) \to 0$
the sums S converge to the integral I.


* The n-dimensional Lebesgue measure of the whole space $R^n$ is infinite. As
proved in [69], the theorems of § 4 are applicable even to the case of measures
admitting infinite values.

The integral


$$\int_{R^1} f(x) \, P(dx) = \int f(x) \, dF(x),$$

if it exists, can be obtained as


$$\lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b f(x) \, dF(x).$$


These classical methods of calculating the integrals


$$\int_{R^1} f(x) \, P(dx)$$


are often useful.
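The calculation by Stieltjes sums is easily carried out numerically. The sketch below is an illustration under stated assumptions: the subintervals are taken of equal length, the intermediate point of each subinterval is its left endpoint, and the distribution function `F` (a mass 1/2 at the point 0 together with a mass 1/2 spread uniformly over the interval from 0 to 1) is chosen only as an example.

```python
import math

def stieltjes_sum(f, F, a, b, n):
    """Stieltjes sum of f with respect to F over [a; b) on n equal subintervals.

    The intermediate point x_k is taken as the left endpoint a_{k-1}.
    """
    step = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        left, right = a + (k - 1) * step, a + k * step
        total += f(left) * (F(right) - F(left))
    return total

def F(x):
    """Left-continuous distribution function: mass 1/2 at x = 0 plus
    mass 1/2 spread uniformly over the interval from 0 to 1."""
    if x <= 0:
        return 0.0
    return 0.5 + 0.5 * min(x, 1.0)

approx = stieltjes_sum(math.cos, F, -1.0, 2.0, 30000)
exact = 0.5 * math.cos(0.0) + 0.5 * math.sin(1.0)
```

The jump of `F` at the point 0 contributes the term $f(0)/2$ and the continuous part contributes $\frac{1}{2}\int_0^1 \cos x \, dx$, so that the sums approach the value `exact` as the subdivision is refined.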


§ 7. INDEPENDENCE. COMPOSITION OF DISTRIBUTIONS


Arbitrary measurable functions
$$\xi_1, \xi_2, \dots, \xi_n$$
of the elementary event u are called independent if for any
$$A_k \in \mathfrak{S}_{\xi_k}, \qquad k = 1, 2, \dots, n,$$
the following equation holds:


$$P\Bigl\{\bigcap_{k=1}^{n} (\xi_k \in A_k)\Bigr\} = \prod_{k=1}^{n} P(\xi_k \in A_k). \qquad (1)$$


This definition is applicable to functions $\xi_k$ which have values of any
nature. They may be, for example, real, complex, or vectorial functions of
the argument u.


We shall be especially occupied with the case in which the $\xi_k$ are real
random variables, i.e., measurable mappings of the set U into the real
line $R^1$.

THEOREM 1. If the random variables $\xi_1, \xi_2, \dots, \xi_n$ are independent, then
their joint distribution $P_{\xi_1 \xi_2 \dots \xi_n}$ is uniquely determined by the
distributions

$$P_{\xi_k}, \qquad k = 1, 2, \dots, n.$$

As a measure in
$$R^n = R^1 \times R^1 \times \dots \times R^1,$$
$P_{\xi_1 \xi_2 \dots \xi_n}$ is the product of the measures $P_{\xi_k}$ in $R^1$:
$$P_{\xi_1 \xi_2 \dots \xi_n} = P_{\xi_1} \times P_{\xi_2} \times \dots \times P_{\xi_n}. \qquad (2)$$
Proof. For any set $A \subset R^n$ of the form
$$A = A_1 \times A_2 \times \dots \times A_n,$$
where
$$A_k \in \mathfrak{S}_{\xi_k}, \qquad k = 1, 2, \dots, n,$$


(1) immediately yields the formula


$$P_{\xi_1 \xi_2 \dots \xi_n}(A) = P_{\xi_1}(A_1) \, P_{\xi_2}(A_2) \dots P_{\xi_n}(A_n). \qquad (3)$$
All parallelepipeds in $R^n$, defined by inequalities
$$a_k \le x_k < b_k, \qquad k = 1, 2, \dots, n,$$


are sets of this form.

The measure $P_{\xi_1 \xi_2 \dots \xi_n}(A)$ is uniquely determined throughout its
domain of definition by its values for such parallelepipeds [this is true of
every measure possessing the properties (μ4) and (μ5)]. Formula (2) is
simply the expression of this construction of the measure $P_{\xi_1 \xi_2 \dots \xi_n}(A)$ by
measures $P_{\xi_k}$ (see [69]).

If all the distributions $P_{\xi_k}$ are continuous, then (2) becomes the classical
formula of the multiplication of densities:


$$p_{\xi_1 \xi_2 \dots \xi_n}(x_1, x_2, \dots, x_n) = \prod_{k=1}^{n} p_{\xi_k}(x_k). \qquad (4)$$


In accordance with the basic theme of this book we are particularly
interested not in the n-dimensional distribution of the variables
$\xi_1, \xi_2, \dots, \xi_n$ but in the one-dimensional distribution of their sum:


$$\zeta_n = \xi_1 + \xi_2 + \dots + \xi_n.$$


In the case of independence of all the summands $\xi_1, \xi_2, \dots, \xi_n$ the
variable $\xi_{k+1}$ is independent of the sum


$$\zeta_k = \xi_1 + \xi_2 + \dots + \xi_k.$$


Hence the calculation of the distribution of the sum $\zeta_n$ can be performed
step by step by means of the following theorem about the distribution of
the sum of two independent variables:


THEOREM 2. If the random variables $\xi$ and $\eta$ are independent of each
other, then the distribution of their sum


$$\zeta = \xi + \eta$$
is given by the formula


$$P_\zeta(A) = \int_{R^1} P_\xi(A - y) \, P_\eta(dy). \qquad (5)$$

Here $A - y$ denotes the set of x for which
$$x + y \in A.$$
For the proof of the theorem it is sufficient to note that
$$P_\zeta(A) = P(\xi + \eta \in A) = P_{\xi\eta}(B),$$
where B is the set of points $(x, y)$ of the plane $R^2$ for which
$$x + y \in A.$$
By the theorem of Fubini,


$$P_{\xi\eta}(B) = \int_{R^1} P_\xi(B_y) \, P_\eta(dy), \qquad (*)$$


where $B_y$ is the set of x for which
$$(x, y) \in B.$$
In our case
$$B_y = A - y.$$
Therefore (5) follows from (*).
Since
$$(-\infty; z) - y = (-\infty; z - y),$$
for
$$F_\zeta(z) = P_\zeta(-\infty; z)$$
we find from (5) that
$$F_\zeta(z) = \int F_\xi(z - y) \, dF_\eta(y). \qquad (6)$$


If the distributions $P_\xi$ and $P_\eta$ are continuous, then (6) becomes


$$p_\zeta(z) = \int p_\xi(z - y) \, p_\eta(y) \, dy.$$


Digressing from the addition of independent summands, we shall
call


$$P(A) = \int_{R^1} P_1(A - y) \, P_2(dy),$$


$$F(z) = \int F_1(z - y) \, dF_2(y),$$


$$p(z) = \int p_1(z - y) \, p_2(y) \, dy$$


compositions of distributions, distribution functions, and densities and
denote them by
$$P = P_1 * P_2,$$
$$F = F_1 * F_2,$$
$$p = p_1 * p_2.$$

This operation is also studied in analysis quite apart from the theory
of probability. We shall use a mixed method, deriving the various properties
of the composition of distributions sometimes purely analytically and
sometimes by taking into consideration the properties of independent
random variables. This is made possible by introducing for any finite
set of distributions


$$P_1, P_2, \dots, P_n$$


the distribution in $R^n$
$$P = P_1 \times P_2 \times \dots \times P_n.$$


As already mentioned in § 6, taking the space $R^n$ to be the set U of
elementary events, taking the distribution P itself as the probability
measure on U, and putting


$$\xi_k(u_1, u_2, \dots, u_n) = u_k,$$
we obtain random variables $\xi_1, \xi_2, \dots, \xi_n$ with the joint distribution


$$P_{\xi_1 \xi_2 \dots \xi_n} = P.$$
By the very definition of the distribution P these random variables will
be independent and will have the distributions
$$P_{\xi_k} = P_k.$$
It follows in particular that the composition of distributions is commutative
and associative.† From the point of view of probability this is obvious.
A purely analytical proof is not difficult either.
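For discrete distributions on the line the composition reduces to a finite sum, and its commutativity and associativity can be verified directly. The following sketch is an illustration only; the function `compose` and the three example distributions are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def compose(P1, P2):
    """Composition P = P1 * P2 of two discrete distributions on the line.

    Each distribution maps its possible values to their probabilities; the
    result assigns to z the sum of P1({z - y}) * P2({y}) over the possible
    values y of the second distribution.
    """
    P = defaultdict(Fraction)
    for y, q in P2.items():
        for x, p in P1.items():
            P[x + y] += p * q
    return dict(P)

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
die = {k: Fraction(1, 6) for k in range(1, 7)}
skew = {0: Fraction(1, 3), 2: Fraction(2, 3)}

two_coins = compose(coin, coin)
```

The composition of the distribution `coin` with itself gives the probabilities 1/4, 1/2, 1/4 for the values 0, 1, 2, that is, the distribution of the sum of two independent summands.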


† Translator's note. In the original the word “distributive” was written for
“associative.”


§ 8. THE STIELTJES INTEGRAL


A considerable part of the theory of the Lebesgue integral can be transferred
to integrals of the more general form


$$\int_A f(x) \, \varphi(dx),$$


where $\varphi(A)$ is a countably additive set function which is real but, in
contrast to a measure, is capable of taking negative values.

Countably additive set functions are usually defined on some Borel
field of sets $\mathfrak{M}_\varphi$, whose unit we shall denote by $U_\varphi$ (see [69]). For any set
$A \subset U_\varphi$, we set

$$\mu_1^*(A) = \sup_{B \subset A} \varphi(B), \qquad \mu_2^*(A) = \sup_{B \subset A} [-\varphi(B)],$$

where the supremum is taken over all sets $B \subset A$ from $\mathfrak{M}_\varphi$. These new set
functions $\mu_1^*(A)$ and $\mu_2^*(A)$ are always finite, non-negative, and possess the
properties of outer measures.‡ If they are considered only on $\mathfrak{M}_{\mu_1}$ and $\mathfrak{M}_{\mu_2}$,
the Borel fields of sets measurable with respect to $\mu_1^*$ and with respect to
$\mu_2^*$, then they become measures $\mu_1(A)$ and $\mu_2(A)$. We have


$$\mathfrak{M}_\varphi \subset \mathfrak{M}_{\mu_1} \cap \mathfrak{M}_{\mu_2}$$
and on $\mathfrak{M}_\varphi$
$$\varphi(A) = \mu_1(A) - \mu_2(A). \qquad (1)$$
By definition,


$$\int_A f(x) \, \varphi(dx) = \int_A f(x) \, \mu_1(dx) - \int_A f(x) \, \mu_2(dx), \qquad (2)$$


where $\mu_1$ and $\mu_2$ are the measures in the canonical decomposition of the
function $\varphi$. Moreover, the integral on the left side exists (by definition)
if and only if both integrals on the right side exist. For an arbitrary
decomposition (1) the formula (2) is valid if both integrals on the right side
exist, but the integral on the left side may exist even when the integrals
on the right side do not.
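On a finite set the canonical decomposition can be found by direct enumeration of subsets, following the definition of the two suprema word for word. The sketch below is an illustration under that assumption; the function names and the weights are hypothetical.

```python
from itertools import combinations
from fractions import Fraction

def phi(weights, A):
    """Additive set function given by (possibly negative) point weights."""
    return sum(weights[u] for u in A)

def upper_variation(weights, A, sign=1):
    """Supremum of sign * phi(B) over all subsets B of the finite set A."""
    A = list(A)
    return max(sign * phi(weights, B)
               for r in range(len(A) + 1)
               for B in combinations(A, r))

weights = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(1, 2), "d": Fraction(-1)}
A = {"a", "b", "c"}
mu1 = upper_variation(weights, A, sign=1)    # positive variation of phi on A
mu2 = upper_variation(weights, A, sign=-1)   # negative variation of phi on A
```

The first supremum is attained on the points of A with positive weight and the second on those with negative weight, so that the difference of the two values reproduces $\varphi(A)$ in accordance with (1).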

In the one-dimensional case $U_\varphi$ is the real line. If the measures $\mu_1$ and $\mu_2$
satisfy the requirements (μ4) and (μ5), and if the set A is the half-open
interval $[a; b)$ (or the interval $(-\infty; b)$ in case $a = -\infty$), the integral (2)
is written as


$$\int_a^b f(x) \, d\Phi(x), \qquad (3)$$
where
$$\Phi(x) = \varphi(-\infty; x).$$
Now
$$\Phi(x) = M_1(x) - M_2(x),$$
where


$$M_1(x) = \mu_1(-\infty; x), \qquad M_2(x) = \mu_2(-\infty; x)$$
are monotone functions satisfying
$$M_1(-\infty) = M_2(-\infty) = 0, \qquad M_1(+\infty) = \mu_1(R^1), \qquad M_2(+\infty) = \mu_2(R^1).$$
Hence $\Phi$ is of bounded variation over the entire real line, is continuous to
the left at every point, and has the limiting values
$$\Phi(-\infty) = 0, \qquad \Phi(+\infty) = \varphi(R^1).$$


If the function $f(x)$ is continuous, then the integral (3) can be calculated
by means of Stieltjes sums, as shown at the end of § 6.


‡ They are called the positive and negative variations of the function $\varphi(A)$.
Their sum, $\mu_1(A) + \mu_2(A)$, is called the total variation of $\varphi(A)$.

In the present exposition it seems expedient to call an integral
$$\int_A f(x) \, \varphi(dx)$$


a Lebesgue integral only if $\varphi$ is a measure (i.e., a non-negative countably
additive set function) and to reserve the name Stieltjes integral for the
more general integral in which $\varphi$ may take negative values.


CHAPTER 2 


DISTRIBUTIONS IN $R^1$ AND THEIR CHARACTERISTIC
FUNCTIONS


§ 9. WEAK CONVERGENCE OF DISTRIBUTIONS


We have seen that from the general mathematical point of view a
probability distribution (or as we shall simply say henceforth, a distribution)
on the real line $R^1$ is the special case of a measure given on $R^1$ and possessing
the properties (μ1)-(μ5). From the class of such measures distributions
are singled out by the condition of normalization
$$\mu(R^1) = 1.$$
We remind the reader of the following definition in functional analysis.
DEFINITION. If the measures
$$\mu_1, \mu_2, \dots, \mu_n, \dots$$
and $\mu$ are defined in the same metric space U and possess the properties
(μ1)-(μ5), then we say that the sequence $\mu_n$ converges weakly to $\mu$,
if for any bounded continuous function $f(u)$ defined on U, the following
relation holds:


$$\int_U f(u) \, \mu_n(du) \to \int_U f(u) \, \mu(du).$$

Applying this definition to distributions in $R^1$, we obtain the type of
convergence of distributions which plays a fundamental role in the greater
part of this book. Weak convergence of $\mu_n$ to $\mu$ is denoted by the symbol


$$\mu_n \Rightarrow \mu.$$


We mention at the outset the following proposition, which we need
further on:


THEOREM. If $\mu_n \Rightarrow \mu$, and if the function $g(x)$ is bounded and continuous
and


$$\lambda_n(A) = \int_A g(x) \, \mu_n(dx), \qquad \lambda(A) = \int_A g(x) \, \mu(dx),$$

then $\lambda_n \Rightarrow \lambda$.


For the proof we note that $\mu_n \Rightarrow \mu$ implies


$$\int_U f(x) \, g(x) \, \mu_n(dx) \to \int_U f(x) \, g(x) \, \mu(dx)$$

for every bounded and continuous function $f(x)$. By Theorem 2 of § 4,
this means that


$$\int_U f(x) \, \lambda_n(dx) \to \int_U f(x) \, \lambda(dx).$$

In dealing with one-dimensional distributions we shall frequently
consider also their corresponding distribution functions
$$F(x) = P(-\infty; x)$$
and instead of $P_n \Rightarrow P$ we shall write
$$F_n \Rightarrow F.$$
In this case a more intuitive idea of the notion of weak convergence may
be obtained from the following theorem:


THEOREM 1. For the weak convergence $F_n \Rightarrow F$ each of the following three
conditions is necessary and sufficient:

(I) $F_n(x) \to F(x)$ at every point x which is a continuity point of the
distribution function $F(x)$.

(II) $F_n(x) \to F(x)$ on some set C which is everywhere dense on the real line.

(III) $L(F_n, F) \to 0$, where the distance $L(G, F)$ between two distribution
functions G and F is defined as the infimum of all h such that for all x

$$F(x - h) - h \le G(x) \le F(x + h) + h.$$

The distance $L(G, F)$ between distribution functions was introduced
by P. Lévy. In Fig. 1 is drawn the strip in which the graph of $G(x)$ must
be located in order to satisfy the inequality $L(G, F) < h$.

It is proved in a quite elementary way that our distance satisfies the
axioms of a metric:

(1) $L(F, G) = 0$ if and only if $F = G$,

(2) $L(F, G) = L(G, F)$,

(3) $L(F, H) \le L(F, G) + L(G, H)$.
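The distance $L(G, F)$ can be approximated numerically by testing the defining inequality on a grid of points x and halving the interval of admissible h. The sketch below is only such an approximation, not an exact computation; the function `levy_distance`, the grid, and the two example distribution functions (unit masses at the points 0 and 0.3, written in left-continuous form) are assumptions of the example.

```python
def levy_distance(F, G, xs, tol=1e-4):
    """Approximate L(G, F): the infimum of h with
    F(x - h) - h <= G(x) <= F(x + h) + h for all x, tested on the grid xs."""
    def ok(h):
        return all(F(x - h) - h <= G(x) <= F(x + h) + h for x in xs)

    lo, hi = 0.0, 1.0          # h = 1 always satisfies the inequality
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi

# Left-continuous distribution functions of unit masses at 0 and at 0.3.
F = lambda x: 1.0 if x > 0 else 0.0
G = lambda x: 1.0 if x > 0.3 else 0.0

xs = [i / 1000 for i in range(-2000, 2001)]
d = levy_distance(F, G, xs)
```

For two unit masses at a distance c < 1 from each other the inequality first holds for h = c, so the computed value is close to 0.3. If the second mass is placed at the point 1/n, the distance tends to zero, in agreement with the weak convergence of these distributions to the unit mass at 0.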

We shall carry out the proof of Theorem 1 in the following way. Denoting
by (IV) the assertion of weak convergence $F_n \Rightarrow F$, we shall prove that
the following relations exist among the propositions (I), (II), (III), and
(IV):

(I) → (II), (II) → (III), (II) → (IV), (III) → (I), (IV) → (I).

The assertion (I) → (II) is obvious. Hence we turn at once to the proof
of the assertion (II) → (III). Take any ε > 0 and choose $a \in C$ and $b \in C$
so that


$$F(a) < \frac{\varepsilon}{2}, \qquad 1 - F(b) < \frac{\varepsilon}{2}.$$
Subdivide the closed interval $[a; b]$ by the points
$$a = a_0 < a_1 < \dots < a_s = b,$$


belonging to C such that the length of each subinterval $[a_{k-1}; a_k]$ is less
than ε. Choose N so that for n > N the following inequality is satisfied
at all points $a_k$:


$$|F_n(a_k) - F(a_k)| < \frac{\varepsilon}{2}.$$
We now prove that for every x and n > N
$$F(x - \varepsilon) - \varepsilon \le F_n(x) \le F(x + \varepsilon) + \varepsilon. \qquad (1)$$
In the proof we consider several different cases. If
$$a_{k-1} \le x \le a_k,$$


then
$$F_n(x) \le F_n(a_k) \le F(a_k) + \frac{\varepsilon}{2} \le F(x + \varepsilon) + \frac{\varepsilon}{2},$$
$$F_n(x) \ge F_n(a_{k-1}) \ge F(a_{k-1}) - \frac{\varepsilon}{2} \ge F(x - \varepsilon) - \frac{\varepsilon}{2}.$$
If
$$x \le a_0,$$
then
$$F_n(x) \le F_n(a_0) \le F(a_0) + \frac{\varepsilon}{2} < \varepsilon \le F(x) + \varepsilon,$$
$$F_n(x) \ge 0 > F(a_0) - \frac{\varepsilon}{2} \ge F(x) - \frac{\varepsilon}{2}.$$
If
$$x \ge a_s,$$
then
$$F_n(x) \le 1 < F(a_s) + \frac{\varepsilon}{2} \le F(x) + \frac{\varepsilon}{2},$$
$$F_n(x) \ge F_n(a_s) \ge F(a_s) - \frac{\varepsilon}{2} > 1 - \varepsilon \ge F(x) - \varepsilon.$$
Since ε is arbitrary, our assertion follows from (1), which can be rewritten
as
$$L(F, F_n) \le \varepsilon.$$

The same construction proves (II) → (IV). We denote by M an upper
bound of $|f(x)|$ and pick $a \in C$ and $b \in C$ so that
$$F(a) < \varepsilon, \qquad 1 - F(b) < \varepsilon.$$


In the closed interval $[a; b]$ the continuous function $f(x)$ is uniformly
continuous, hence there exist points $a_k$ of C in this interval,


$$a = a_0 < a_1 < a_2 < \dots < a_s = b,$$


such that
$$|f(x) - f(a_k)| < \varepsilon$$
for
$$a_k \le x < a_{k+1}, \qquad k = 0, 1, 2, \dots, s - 1.$$
Construct the auxiliary function
$$f_\varepsilon(x) = \begin{cases} f(a_k) & \text{for } a_k \le x < a_{k+1}, \quad k = 0, 1, 2, \dots, s - 1, \\ 0 & \text{for } x < a \text{ or } x \ge b. \end{cases}$$


Obviously, for any distribution function $G(x)$


$$\int f_\varepsilon(x) \, dG(x) = \sum_{k=0}^{s-1} f(a_k) \, [G(a_{k+1}) - G(a_k)].$$


Since $F_n(x) \to F(x)$ as $n \to \infty$ at the points $x = a_k$, we have


$$\int f_\varepsilon(x) \, dF_n(x) \to \int f_\varepsilon(x) \, dF(x). \qquad (2)$$
At the same time, for any distribution function $G(x)$
$$\Bigl| \int f(x) \, dG(x) - \int f_\varepsilon(x) \, dG(x) \Bigr| \le \int_{-\infty}^{a} |f(x) - f_\varepsilon(x)| \, dG(x)$$


$$+ \int_a^b |f(x) - f_\varepsilon(x)| \, dG(x) + \int_b^{\infty} |f(x) - f_\varepsilon(x)| \, dG(x)$$


$$\le M G(a) + \varepsilon \, [G(b) - G(a)] + M \, [1 - G(b)].$$


Applying this estimate to $G(x) = F(x)$ and $G(x) = F_n(x)$ and noting
that $F_n(a)$ and $F_n(b)$ converge to $F(a)$ and $F(b)$, it is easy to show that
for sufficiently large n


$$\Bigl| \int f(x) \, dF(x) - \int f_\varepsilon(x) \, dF(x) \Bigr| < (2M + 1) \, \varepsilon,$$
$$\Bigl| \int f(x) \, dF_n(x) - \int f_\varepsilon(x) \, dF_n(x) \Bigr| < (2M + 1) \, \varepsilon.$$
Together with (2) this gives
$$\Bigl| \int f(x) \, dF_n(x) - \int f(x) \, dF(x) \Bigr| < (4M + 3) \, \varepsilon$$


for sufficiently large n. Since ε > 0 is arbitrary, our assertion is proved.

We shall now prove that (III) → (I). Let $x_0$ be a continuity point of
$F(x)$. Then for every ε > 0 there exists δ > 0 such that


$$|F(x) - F(x_0)| < \varepsilon$$

for
$$|x - x_0| < \delta.$$
Let
$$H = \min(\varepsilon, \delta)$$
and let n be so large that $L(F_n, F) < H$. It is easy to see that
$$F_n(x_0) \ge F(x_0 - H) - H > F(x_0) - 2\varepsilon,$$
$$F_n(x_0) \le F(x_0 + H) + H < F(x_0) + 2\varepsilon.$$
Since ε is arbitrary, our assertion is proved.
[Fig. 2. Graphs of the functions $f_1(x)$ and $f_2(x)$.]
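Finally, we shall prove that (IV) → (I). Let $x_0$ be a continuity point of
$F(x)$ and let
$$F_n \Rightarrow F.$$
Take δ > 0 so that for $|x - x_0| < \delta$
$$|F(x) - F(x_0)| < \varepsilon,$$

and construct the functions (Fig. 2)

$$f_1(x) = \begin{cases} 1 & \text{for } x \le x_0 - \delta, \\ \dfrac{x_0 - x}{\delta} & \text{for } x_0 - \delta \le x \le x_0, \\ 0 & \text{for } x \ge x_0; \end{cases}$$

$$f_2(x) = \begin{cases} 1 & \text{for } x \le x_0, \\ 1 - \dfrac{x - x_0}{\delta} & \text{for } x_0 \le x \le x_0 + \delta, \\ 0 & \text{for } x \ge x_0 + \delta. \end{cases}$$

It is easy to verify that

$$\int f_1(x) \, dF(x) \ge \int_{-\infty}^{x_0 - \delta} 1 \, dF(x) = F(x_0 - \delta) > F(x_0) - \varepsilon,$$
$$\int f_2(x) \, dF(x) \le \int_{-\infty}^{x_0 + \delta} 1 \, dF(x) = F(x_0 + \delta) < F(x_0) + \varepsilon, \qquad (3)$$

$$\int f_1(x) \, dF_n(x) \le \int_{-\infty}^{x_0} 1 \, dF_n(x) = F_n(x_0),$$
$$\int f_2(x) \, dF_n(x) \ge \int_{-\infty}^{x_0} 1 \, dF_n(x) = F_n(x_0). \qquad (4)$$
For sufficiently large n,

$$\Bigl| \int f_1(x) \, dF_n(x) - \int f_1(x) \, dF(x) \Bigr| < \varepsilon,$$
$$\Bigl| \int f_2(x) \, dF_n(x) - \int f_2(x) \, dF(x) \Bigr| < \varepsilon. \qquad (5)$$


It follows from (3), (4), and (5) that
$$F(x_0) - 2\varepsilon < F_n(x_0) < F(x_0) + 2\varepsilon.$$

Since ε > 0 is arbitrary, the assertion (IV) → (I) is proved. With this the
proof of Theorem 1 is also completed.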

THEOREM 2. The metric space $\mathfrak{N}^1$ of one-dimensional distributions with
the distance $L(F, G)$ is complete.

Let the sequence $F_1, F_2, \dots, F_n, \dots$ satisfy Cauchy's condition

$$L(F_n, F_m) \to 0 \qquad (6)$$
as $n \to \infty$, $m \to \infty$. Pick an everywhere dense set
$$C = \{x_1, x_2, \dots, x_s, \dots\}$$


of points on the real line. Since the values of $F_n(x_s)$ are bounded, the
well-known diagonal argument proves the existence of a subsequence


$$F_{n_1}(x), F_{n_2}(x), \dots, F_{n_k}(x), \dots$$
which converges at every point $x = x_s$. The limit
$$v(x_s) = \lim_{k \to \infty} F_{n_k}(x_s)$$
is defined on the set C and is a nondecreasing function there.


Now set
$$F(x) = \sup_{x_s < x} v(x_s).$$

The function $F(x)$ is defined everywhere on the real line, and is
nondecreasing and continuous to the left. From (6) it easily follows that
$F(-\infty) = 0$ and $F(+\infty) = 1$.

In fact, for any ε > 0 there exists an n such that $L(F_n, F_m) < \varepsilon$ for $m \ge n$.
We can find a z such that $F_n(z) < \varepsilon$. Then for $x_s < z - \varepsilon$,

$$F_{n_k}(x_s) \le F_n(z) + \varepsilon < 2\varepsilon \quad \text{for } n_k \ge n$$
and therefore

$$v(x_s) \le 2\varepsilon.$$

Since ε > 0 is arbitrary, it follows from this that $F(-\infty) = 0$. It is
similarly proved that $F(+\infty) = 1$.


It is easy to see that $F_{n_k}(x)$ converges to $F(x)$ at every continuity
point of the latter function. Therefore


$$\lim_{k \to \infty} L(F_{n_k}, F) = 0.$$
It follows from this, together with (6), that $\lim_{n \to \infty} L(F_n, F) = 0$.


THEOREM 3. In order that the set $S$ of distributions be conditionally compact †
in St!, it is necessary and sufficient that the conditions
\[ F(x) \to 0 \quad \text{for } x \to -\infty, \]
\[ F(x) \to 1 \quad \text{for } x \to +\infty \]
be satisfied uniformly in $S$.


Proof. Let a sequence of distribution functions $F_n(x)$ in $S$ be given.
Just as in the proof of Theorem 2, we shall pick a set $C$ everywhere dense
on the real line, and a subsequence
\[ F_{n_1}(x),\ F_{n_2}(x),\ \ldots,\ F_{n_k}(x),\ \ldots \]
converging to some nondecreasing, left-continuous function $F(x)$ at every
continuity point of the latter. The condition of the theorem guarantees
that the limit function $F(x)$ satisfies the requirement
\[ F(-\infty) = 0, \qquad F(+\infty) = 1, \]
that is, $F(x)$ is a distribution function.


The definition of weak convergence at the beginning of this chapter was
given for measures $\mu$ which need not satisfy the condition of normalization
\[ \mu(R^1) = 1, \]
characterizing a probability distribution. As was mentioned in § 8, to
every measure $\mu$ in $R^1$ there corresponds a nondecreasing, left-continuous
function
\[ M(x) = \mu(-\infty;\, x), \qquad M(-\infty) = 0, \quad M(+\infty) = \mu(R^1). \]

† Translator's note. The adverb "conditionally" is added. Note that a limit
distribution need not belong to $S$.

In accordance with § 8, we shall write
\[ \int g(x)\,\mu(dx) = \int g(x)\,dM(x), \]
and instead of the weak convergence $\mu_n \Rightarrow \mu$ we shall talk about the weak
convergence of the corresponding functions $M_n \Rightarrow M$.

The reader can easily verify that Theorems 1, 2, 3 remain valid in the
following form (see [69]).


THEOREM 1 bis. For the weak convergence $M_n \Rightarrow M$ each of the three
following conditions is necessary and sufficient:

(I) $M_n(x) \to M(x)$ at every point $x$ which is a continuity point of the
function $M(x)$, and $M_n(+\infty) \to M(+\infty)$.

(II) $M_n(x) \to M(x)$ on some set $C$ everywhere dense on the real line.†

(III) $L(M_n, M) \to 0$, where the distance $L(M_1, M_2)$ between two functions
$M_1$ and $M_2$ is defined as the infimum of all $h$ such that for all $x$
\[ M_1(x - h) - h \le M_2(x) \le M_1(x + h) + h. \]
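The distance in (III) can be approximated numerically. The following is only a minimal sketch, not part of the original text: it assumes the inequality is tested on a finite grid of points, and the helper names `normal_cdf`, `levy_feasible` and `levy_distance` are hypothetical. It estimates the distance between a standard normal distribution function and the same function shifted by 0.1.

```python
import math

def normal_cdf(x, mean=0.0, sigma=1.0):
    # Normal distribution function expressed through the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def levy_feasible(F, G, h, grid):
    # h is admissible if F(x-h) - h <= G(x) <= F(x+h) + h at every grid point.
    return all(F(x - h) - h <= G(x) <= F(x + h) + h for x in grid)

def levy_distance(F, G, lo=-10.0, hi=10.0, steps=2001, tol=1e-6):
    # Admissibility is monotone in h, so the infimum can be found by bisection.
    # Assumption: checking the inequality on a grid is accurate enough.
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    a, b = 0.0, 1.0  # h = 1 is always admissible for distribution functions
    while b - a > tol:
        mid = 0.5 * (a + b)
        if levy_feasible(F, G, mid, grid):
            b = mid
        else:
            a = mid
    return b

F = normal_cdf
G = lambda x: normal_cdf(x, mean=0.1)
d = levy_distance(F, G)
print(d)
```

The computed value is smaller than the shift 0.1, as it must be, since $h = 0.1$ is itself admissible for a pure translation.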


THEOREM 2 bis. The space $\mathfrak{M}$ of functions $M(x)$ with the distance
$L(M_1, M_2)$ is complete.

THEOREM 3 bis. In order that the set $S$ be conditionally compact in $\mathfrak{M}$,
it is necessary and sufficient that the limiting values $M(+\infty)$ be bounded
and that the conditions
\[ M(x) \to 0 \quad \text{for } x \to -\infty, \]
\[ M(x) \to M(+\infty) \quad \text{for } x \to +\infty \]
be satisfied uniformly on the set $S$.


§ 10. TYPES OF DISTRIBUTIONS


It is often useful to consider, together with the random variable $\xi$,
another random variable $\eta$ connected with $\xi$ by a linear relation
\[ \eta = a\xi + b \]
($a > 0$ and $b$ arbitrary). Geometrically this transformation means a change
of scale and of origin. It is easily seen that the distribution functions of
$\xi$ and $\eta$ are connected by the equation
\[ F_\xi(x) = F_\eta(ax + b). \]
As an illustration, suppose that as the solution of a certain problem we
obtain the normal distribution
\[ F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(z - a)^2}{2\sigma^2}}\,dz. \]

† Translator's note. It is necessary to add the condition $M_n(+\infty) \to M(+\infty)$.

Now, for numerical computations we are not going to make a special table
for this function, but will utilize the available tables of the function
\[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{z^2}{2}}\,dz, \]
noticing that $F(x)$ and $\Phi(x)$ are connected by the equation
\[ F(x) = \Phi\!\left(\frac{x - a}{\sigma}\right). \]
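The reduction to the tabulated function can be checked numerically. The sketch below is an illustration only: the values of $a$, $\sigma$ and $x$ are arbitrary, the helper names are hypothetical, and the "table" of $\Phi$ is replaced by the error function of the Python standard library.

```python
import math

def phi_std(x):
    # Standard normal distribution function (plays the role of the table).
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf_by_integration(x, a, sigma, n=20000):
    # Midpoint-rule integration of the normal density from a - 12*sigma to x.
    lo = a - 12.0 * sigma
    h = (x - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += math.exp(-(z - a) ** 2 / (2.0 * sigma ** 2))
    return total * h / (sigma * math.sqrt(2.0 * math.pi))

a, sigma, x = 1.5, 2.0, 2.7          # arbitrary illustrative values
direct = normal_cdf_by_integration(x, a, sigma)
via_table = phi_std((x - a) / sigma)
print(direct, via_table)
```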


In this connection it is natural to make use of the following concept.

DEFINITION. The distribution functions $F_1(x)$ and $F_2(x)$ belong to the
same type if for some constants $a > 0$ and $b$ the following equation
holds:
\[ F_1(x) = F_2(ax + b), \]
or, what is the same,
\[ F_2(x) = F_1\!\left(\frac{x - b}{a}\right). \]

Since the property of belonging to the same type is symmetrical and
transitive, the totality of distribution functions falls into mutually disjoint
types.

It is easy to see that all normal laws of distribution form one type, the
normal type; all improper distribution functions form the improper type.

The types of distribution functions other than the improper type are
called proper.

THEOREM 1. If the sequence $\{F_n(x)\}$ of distribution functions converges
as $n \to \infty$ to a proper distribution function $F(x)$, then for any choice of
the constants $a_n > 0$ and $b_n$ the sequence $\{F_n(a_n x + b_n)\}$ can converge to a
proper distribution only if this is of the same type as $F(x)$.

Proof. Suppose that as $n \to \infty$, $F_n(x) \Rightarrow F(x)$ and
$F_n(a_n x + b_n) \Rightarrow G(x)$, and that both $F$ and $G$ are proper. We must
prove that there exist $a > 0$ and $b$ such that
\[ G(x) = F(ax + b). \tag{1} \]

First of all, pick a sequence of integers $n_1 < n_2 < \cdots < n_k < \cdots$ such
that the limits $\lim_{k \to \infty} a_{n_k} = a$ and $\lim_{k \to \infty} b_{n_k} = b$
($0 \le a \le +\infty$, $-\infty \le b \le +\infty$) exist.
Henceforth we shall consider only this sequence of indices and, to
simplify the notation, we assume (without loss of generality) that
\[ \lim_{n \to \infty} a_n = a, \qquad \lim_{n \to \infty} b_n = b. \]


Let us prove that $0 < a < +\infty$. Suppose that $a = +\infty$. Denote by $u$
the supremum of the numbers $x$ for which
\[ \lim_{n \to \infty} (a_n x + b_n) < +\infty. \]
For $v < x < u$
\[ \lim_{n \to \infty} (a_n v + b_n) \le \lim_{n \to \infty} (v - x)a_n + \lim_{n \to \infty} (a_n x + b_n), \]
hence by the assumptions made above we have, for every $v < u$,
\[ \lim_{n \to \infty} (a_n v + b_n) = -\infty. \]
Consequently, $G(v) = 0$ for $v < u$. For $v > u$,
\[ \lim_{n \to \infty} (a_n v + b_n) = +\infty, \]
hence $G(v) = 1$ for $v > u$.

The assumption that $a = +\infty$ contradicts the fact that $G(x)$ is proper,
hence it must be rejected.

It follows readily that $b$ must also be finite. In fact, the assumptions
\[ \lim (a_n x + b_n) = +\infty, \qquad \lim (a_n x + b_n) = -\infty \]
lead in the first case to $G(x) = 1$, and in the second to $G(x) = 0$.

Now suppose that $a = 0$. In this case for every $x$ and $\varepsilon > 0$
\[ b - \varepsilon < a_n x + b_n < b + \varepsilon \]
for sufficiently large $n$. Hence
\[ F_n(b - \varepsilon) \le F_n(a_n x + b_n) \le F_n(b + \varepsilon), \]
and if $\varepsilon$ is chosen so that the function $F(x)$ is continuous at the points
$b - \varepsilon$ and $b + \varepsilon$, then
\[ F(b - \varepsilon) \le G(x) \le F(b + \varepsilon). \]

Since $x$ is arbitrary, we must have
\[ F(b - \varepsilon) = 0, \qquad F(b + \varepsilon) = 1, \]
that is, $F(x)$ is improper, which contradicts the condition of the theorem.

Finally, let $x$ be chosen so that $F(x)$ is continuous at the point $ax + b$
and $G(x)$ is continuous at the point $x$. Then, on the one hand,
\[ \lim_{n \to \infty} F_n(a_n x + b_n) = G(x), \]
and on the other hand,
\[ \lim_{n \to \infty} F_n(a_n x + b_n) = F(ax + b). \]
The last equation requires clarification. Since
\[ \lim_{n \to \infty} (a_n x + b_n) = ax + b, \]
for sufficiently large $n$
\[ ax + b - \varepsilon < a_n x + b_n < ax + b + \varepsilon, \]
where $\varepsilon > 0$ is chosen so that the function $F$ is continuous at the points
$ax + b - \varepsilon$ and $ax + b + \varepsilon$. Hence
\[ F_n(ax + b - \varepsilon) \le F_n(a_n x + b_n) \le F_n(ax + b + \varepsilon) \]
and in the limit as $n \to \infty$
\[ F(ax + b - \varepsilon) \le \liminf_{n \to \infty} F_n(a_n x + b_n) \le \limsup_{n \to \infty} F_n(a_n x + b_n) \le F(ax + b + \varepsilon). \]
Since $ax + b$ is a continuity point of $F(x)$, and since $\varepsilon$ is arbitrary, we
obtain (1) from the preceding inequality. Q.E.D.

In the following we shall need the next proposition [40].


THEOREM 2. For a sequence of distribution functions $F_n(x)$ the relations
\[ F_n(b_n x + a_n) \Rightarrow F(x), \tag{2} \]
\[ F_n(\beta_n x + \alpha_n) \Rightarrow F(x), \tag{3} \]
as $n \to \infty$, where $b_n > 0$, $\beta_n > 0$, $a_n$, $\alpha_n$ are real constants and $F(x)$ is a
proper distribution function, are satisfied simultaneously if and only if
\[ \frac{\beta_n}{b_n} \to 1, \qquad \frac{\alpha_n - a_n}{b_n} \to 0 \tag{4} \]
as $n \to \infty$.

Proof. We shall first prove that (2) and (4) imply (3).

Let $x_1$, $x_2$ and $x$ be continuity points of the function $F(x)$, and let
$x_1 < x < x_2$. Then by (4)
\[ x_1 < \frac{\beta_n}{b_n}\,x + \frac{\alpha_n - a_n}{b_n} < x_2, \]
that is,
\[ b_n x_1 + a_n < \beta_n x + \alpha_n < b_n x_2 + a_n \]
for sufficiently large $n$. Hence we conclude that
\[ F_n(b_n x_1 + a_n) \le F_n(\beta_n x + \alpha_n) \le F_n(b_n x_2 + a_n). \]
In the limit this gives
\[ F(x_1) \le \liminf_{n \to \infty} F_n(\beta_n x + \alpha_n) \le \limsup_{n \to \infty} F_n(\beta_n x + \alpha_n) \le F(x_2). \]
As $x_1 \to x$ and $x_2 \to x$, the chain of inequalities just written down
becomes (3).

We now prove that, conversely, (2) and (3) imply (4). For this purpose
put
\[ B_n = \frac{\beta_n}{b_n}, \qquad A_n = \frac{\alpha_n - a_n}{b_n}, \qquad G_n(x) = F_n(b_n x + a_n). \]
It is easy to verify that in this notation the relations (2) and (3) are
transformed, as $n \to \infty$, into
\[ G_n(x) \Rightarrow F(x), \]
\[ G_n(B_n x + A_n) \Rightarrow F(x). \]
As in the proof of the preceding theorem, pick a sequence
$n_1 < n_2 < \cdots < n_k < \cdots$ such that as $k \to \infty$
\[ A_{n_k} \to A, \qquad B_{n_k} \to B. \]
The argument used there proves that $A$ and $B$ must be finite numbers
($B > 0$) and that the equation
\[ F(x) = F(Bx + A) \tag{5} \]
must hold. Suppose that $B \ne 1$ and consider the two possible cases.

Case $B < 1$. $n$-fold application of Eq. (5) leads to the relation
\[ F(x) = F\bigl(B^n x + A(1 + B + \cdots + B^{n-1})\bigr). \]
Since $n$ is arbitrary and $\lim_{n \to \infty} B^n = 0$, we conclude from this that
for any $x$
\[ F(x) = F\!\left(\frac{A}{1 - B}\right). \]
For a distribution function this equation is impossible.

Case $B > 1$ reduces to the preceding, since (5) can be written as
\[ F(x) = F\!\left(\frac{1}{B}\,x - \frac{A}{B}\right). \]
Thus we must have $B = 1$.

Now if $A \ne 0$, then (5) leads to
\[ F(x) = F(x + nA), \]
where $n$ is an arbitrary integer. Hence we find that for every $x$
\[ F(-\infty) = \lim_{nA \to -\infty} F(x + nA) = F(x) = \lim_{nA \to +\infty} F(x + nA) = F(+\infty). \]
Since these equations are impossible for a distribution function, it follows
that $A = 0$.

Thus it is proved that as $k \to \infty$
\[ A_{n_k} \to 0, \qquad B_{n_k} \to 1. \]
We shall prove, moreover, that
\[ A_n \to 0, \qquad B_n \to 1 \qquad (n \to \infty). \tag{6} \]
Suppose the contrary; then there exists a number $\delta > 0$ and a sequence
of indices $n'_k$ such that at least one of the inequalities
\[ \lim_{k \to \infty} |B_{n'_k} - 1| \ge \delta, \qquad \lim_{k \to \infty} |A_{n'_k}| \ge \delta \tag{7} \]
holds.
Without loss of generality we may choose this sequence of indices so
that
\[ A_{n'_k} \to A', \qquad B_{n'_k} \to B' \]
as $k \to \infty$.
From the above it is clear that we must have $A' = 0$, $B' = 1$. But these
equations contradict (7). Therefore (6), and so also (4), is proved.


§ 11. THE DEFINITION AND THE SIMPLEST PROPERTIES
OF THE CHARACTERISTIC FUNCTION


It is well known from elementary courses in the theory of probability
that the variance
\[ D^2\xi = M(\xi - M\xi)^2 \]
is of great service in the study of sums of independent random variables.
The use of variances (for example in the proof of Chebyshev's theorem) is
based upon the fundamental property that the variance of a sum of
independent variables $\xi_i$ is additive:
\[ D^2(\xi_1 + \xi_2 + \cdots + \xi_n) = D^2\xi_1 + D^2\xi_2 + \cdots + D^2\xi_n. \]

This property of the variance of the sum of independent variables is
analogous to the property of the mathematical expectation of the sum
\[ M(\xi_1 + \xi_2 + \cdots + \xi_n) = M\xi_1 + M\xi_2 + \cdots + M\xi_n \]
(for mathematical expectations the requirement of independence of the
summands is superfluous). It is natural to raise the general question of
finding further characteristics $A\xi$ of random variables which would possess
the property of additivity,
\[ A(\xi_1 + \xi_2 + \cdots + \xi_n) = A\xi_1 + A\xi_2 + \cdots + A\xi_n, \]
for independent summands. Success in this direction would indeed be
complete, should the characteristic have a definite finite value for all
random variables (in contrast with the mathematical expectation and the
variance, the existence of which is an additional restriction).
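The additivity of $M$ and $D^2$ for independent summands can be verified exactly on a small example. The sketch below is illustrative only: the two discrete laws (a fair die and a biased coin) and all function names are assumptions made for the example, and exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    # Mathematical expectation M of a discrete law given as {value: probability}.
    return sum(p * x for x, p in dist.items())

def variance(dist):
    # Variance D^2 = M(xi - M xi)^2.
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def sum_of_independent(d1, d2):
    # Law of xi + eta for independent xi and eta (convolution of the two laws).
    out = {}
    for (x, p), (y, q) in product(d1.items(), d2.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + p * q
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {0: Fraction(1, 3), 1: Fraction(2, 3)}
total = sum_of_independent(die, coin)
print(mean(total), variance(total))
```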

The answer to this question is given by introducing the concept of the
characteristic function of the random variable (see also § 15):
\[ f_\xi(t) = M e^{it\xi} = \int e^{itx}\,dF_\xi(x). \tag{1} \]
The carrying over of the concepts of the mathematical expectation and
the integral to functions with complex values, which enter here, presents
no difficulties: if $w(x) = u(x) + iv(x)$, then set
\[ \int_A w(x)\,\mu(dx) = \int_A u(x)\,\mu(dx) + i \int_A v(x)\,\mu(dx). \]
Since
\[ |e^{itx}| = 1, \tag{2} \]
the function $f(t)$ is defined for all real $t$ for every distribution function
$F(x)$. We shall denote a distribution function and its corresponding
characteristic function by a capital letter and the corresponding small
letter.


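THEOREM 1. A characteristic function is uniformly continuous on the
whole line and satisfies the conditions
\[ f(0) = 1, \qquad |f(t)| \le 1 \qquad (-\infty < t < +\infty). \tag{3} \]

Proof. (3) follows immediately from the definition of a characteristic
function (1). It remains to prove the uniform continuity of the function
$f(t)$. For this purpose we introduce an inequality which will be useful also
in what follows, namely, if
\[ F(A) - F(-A) > 1 - \varepsilon, \]
then
\[ |f(t'') - f(t')| \le A\,|t'' - t'| + 2\varepsilon. \tag{4} \]
To prove this we note that for real $z'$ and $z''$ the following inequalities hold:
\[ |e^{iz''} - e^{iz'}| \le |z'' - z'| \quad \left(\text{since } \left|\frac{d}{dz}\,e^{iz}\right| = 1\right), \qquad |e^{iz''} - e^{iz'}| \le 2. \]
Therefore
\[
\begin{aligned}
|f(t'') - f(t')| &\le \left| \int_{|x| \le A} \bigl(e^{it''x} - e^{it'x}\bigr)\,dF(x) \right| + \left| \int_{|x| > A} \bigl(e^{it''x} - e^{it'x}\bigr)\,dF(x) \right| \\
&\le \int_{|x| \le A} |it''x - it'x|\,dF(x) + 2\,[F(-A) + 1 - F(A)] \\
&\le A\,|t'' - t'| + 2\varepsilon.
\end{aligned}
\]
Q.E.D.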


THEOREM 2. If $\eta = a\xi + b$, where $a$ and $b$ are constants, then the
characteristic functions of the random variables $\xi$ and $\eta$ are connected by the equation
\[ f_\eta(t) = f_\xi(at)\,e^{itb}, \]
and if $a > 0$, then their distribution functions satisfy the relation
\[ F_\eta(x) = F_\xi\!\left(\frac{x - b}{a}\right). \]

Proof. In fact,
\[ f_\eta(t) = M e^{it\eta} = M e^{it(a\xi + b)} = e^{itb}\,M e^{ita\xi} = e^{itb} f_\xi(at), \]
and for $a > 0$,
\[ F_\eta(x) = P(a\xi + b < x) = P\!\left(\xi < \frac{x - b}{a}\right) = F_\xi\!\left(\frac{x - b}{a}\right). \]

The advantages of using characteristic functions are based mainly on
their next property:

THEOREM 3. The characteristic function of the sum of two independent
random variables is the product of the characteristic functions of the
summands.

Proof. Obviously, together with $\xi$ and $\eta$, the random variables $e^{it\xi}$ and $e^{it\eta}$
are also independent. Therefore *
\[ M e^{it(\xi + \eta)} = M\bigl(e^{it\xi} \cdot e^{it\eta}\bigr) = M e^{it\xi} \cdot M e^{it\eta}. \]


COROLLARY 1. If $\xi = \xi_1 + \xi_2 + \cdots + \xi_n$ and if each summand $\xi_k$ is
independent of the sum of the preceding summands $\xi_1 + \xi_2 + \cdots + \xi_{k-1}$,
then the characteristic function of $\xi$ is the product of the characteristic
functions of the summands.

COROLLARY 2. The squared modulus of a characteristic function is a
characteristic function.

Proof. Let $\xi$ and $\eta$ be independent and have the same law of distribution
with the characteristic function $f(t)$. Then, by Theorem 2,
\[ f_{-\eta}(t) = f(-t) = \overline{f(t)}, \]
and by Theorem 3,
\[ f_{\xi - \eta}(t) = f_\xi(t) \cdot f_{-\eta}(t) = f(t)\,\overline{f(t)} = |f(t)|^2. \]
Q.E.D.
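Theorem 3 and Corollary 2 can be illustrated on finite discrete laws, where the characteristic function is a finite sum. This is a minimal sketch under assumptions: the two laws `xi` and `eta`, the point `t`, and the helper names are invented for the example.

```python
import cmath
from itertools import product

def char_fn(dist, t):
    # f(t) = M exp(it*xi) for a discrete law given as {value: probability}.
    return sum(p * cmath.exp(1j * t * x) for x, p in dist.items())

def sum_of_independent(d1, d2):
    # Law of the sum of two independent discrete random variables.
    out = {}
    for (x, p), (y, q) in product(d1.items(), d2.items()):
        out[x + y] = out.get(x + y, 0.0) + p * q
    return out

xi = {0: 0.2, 1: 0.5, 3: 0.3}
eta = {-1: 0.4, 2: 0.6}
t = 0.7

# Theorem 3: characteristic function of the sum equals the product.
lhs = char_fn(sum_of_independent(xi, eta), t)
rhs = char_fn(xi, t) * char_fn(eta, t)

# Corollary 2: xi - xi' (xi' an independent copy) has characteristic function |f|^2.
neg_xi = {-x: p for x, p in xi.items()}
sym = char_fn(sum_of_independent(xi, neg_xi), t)
print(abs(lhs - rhs), abs(sym - abs(char_fn(xi, t)) ** 2))
```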


EXAMPLE 1. The random variable $\xi$ is distributed according to the normal
law with mathematical expectation $a$ and variance $\sigma^2$. The characteristic
function of $\xi$ is
\[ \varphi(t) = \int_{-\infty}^{\infty} e^{itx}\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - a)^2}{2\sigma^2}}\,dx. \]
Substituting
\[ z = \frac{x - a}{\sigma} - it\sigma, \]
we reduce $\varphi(t)$ to the form
\[ \varphi(t) = e^{iat - \frac{t^2\sigma^2}{2}}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty - it\sigma}^{\infty - it\sigma} e^{-\frac{z^2}{2}}\,dz. \]

* The mathematical expectation of the product of independent variables is
the product of the mathematical expectations.

Now, it is known that for every real $\alpha$,
\[ \int_{-\infty - i\alpha}^{\infty - i\alpha} e^{-\frac{z^2}{2}}\,dz = \sqrt{2\pi}, \]
consequently
\[ \varphi(t) = e^{iat - \frac{1}{2}\sigma^2 t^2}. \]
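The closed form of Example 1 can be compared with a direct numerical evaluation of the defining integral. The following is a sketch only: the truncation of the integral to $a \pm 12\sigma$, the midpoint rule, the parameter values and the function names are all assumptions of the illustration.

```python
import cmath
import math

def normal_cf_numeric(t, a, sigma, n=40000):
    # Midpoint-rule approximation of the integral of exp(itx) * density(x).
    lo, hi = a - 12.0 * sigma, a + 12.0 * sigma
    h = (hi - lo) / n
    total = 0j
    for i in range(n):
        x = lo + (i + 0.5) * h
        density = math.exp(-(x - a) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        total += cmath.exp(1j * t * x) * density
    return total * h

def normal_cf_closed(t, a, sigma):
    # Closed form exp(i*a*t - sigma^2 * t^2 / 2) from Example 1.
    return cmath.exp(1j * a * t - 0.5 * sigma ** 2 * t ** 2)

a, sigma, t = 1.0, 2.0, 0.8          # arbitrary illustrative values
numeric = normal_cf_numeric(t, a, sigma)
closed = normal_cf_closed(t, a, sigma)
print(abs(numeric - closed))
```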


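EXAMPLE 2. The random variable $\xi$ takes only non-negative integral
values and
\[ P\{\xi = k\} = \frac{\lambda^k e^{-\lambda}}{k!} \qquad (k = 0, 1, 2, \ldots), \]
where $\lambda > 0$ is a constant (Poisson's law).
The characteristic function of the random variable $\xi$ is
\[
f(t) = M e^{it\xi} = \sum_{k=0}^{\infty} e^{itk}\,P\{\xi = k\} = \sum_{k=0}^{\infty} e^{itk}\,\frac{\lambda^k e^{-\lambda}}{k!}
= e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{it})^k}{k!} = e^{-\lambda}\,e^{\lambda e^{it}} = e^{\lambda(e^{it} - 1)}.
\]
Moreover it is easy to see that
\[ M\xi = \lambda; \qquad D^2\xi = \lambda. \]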
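The summation in Example 2 can be reproduced numerically by truncating the series. A minimal sketch, assuming that 200 terms are more than enough for the chosen $\lambda$; the function names and the parameter values are hypothetical.

```python
import cmath
import math

def poisson_cf_series(t, lam, terms=200):
    # Partial sum of sum_k exp(itk) * lam^k * exp(-lam) / k!.
    total = 0j
    weight = math.exp(-lam)          # P{xi = 0}
    for k in range(terms):
        total += cmath.exp(1j * t * k) * weight
        weight *= lam / (k + 1)      # P{xi = k+1} from P{xi = k}
    return total

def poisson_cf_closed(t, lam):
    # Closed form exp(lam * (exp(it) - 1)).
    return cmath.exp(lam * (cmath.exp(1j * t) - 1.0))

lam, t = 3.5, 1.3                    # arbitrary illustrative values
series_value = poisson_cf_series(t, lam)
closed_value = poisson_cf_closed(t, lam)
print(abs(series_value - closed_value))
```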
EXAMPLE 3. The random variable $\mu$ is the number of occurrences of the
event $A$ in $n$ independent trials, in each of which the probability of
occurrence of $A$ is $p$.
The random variable $\mu$ can be represented as the sum
\[ \mu = \mu_1 + \mu_2 + \cdots + \mu_n \]
of $n$ independent random variables, each of which takes only the two
values 0 and 1, with the probabilities $q = 1 - p$ and $p$ respectively. The
random variable $\mu_k$ takes the value 1 if the event $A$ occurs in the $k$th trial,
and the value 0 if the event $A$ does not occur in the $k$th trial.

The characteristic function of the variable $\mu_k$ is
\[ f_k(t) = M e^{it\mu_k} = e^{it \cdot 0}\,q + e^{it}\,p = q + p e^{it}. \]
According to Theorem 3 the characteristic function of the variable $\mu$ is
\[ f(t) = \prod_{k=1}^{n} f_k(t) = (q + p e^{it})^n. \]
Let us also find the characteristic function of the variable
\[ \eta = (\mu - np)/\sqrt{npq}. \]
By Theorem 2 it is
\[ f_\eta(t) = e^{-it\sqrt{\frac{np}{q}}} \left( q + p\,e^{\frac{it}{\sqrt{npq}}} \right)^{n} = \left( q\,e^{-it\sqrt{\frac{p}{nq}}} + p\,e^{it\sqrt{\frac{q}{np}}} \right)^{n}. \]

To conclude this section we remark that the definition of the
characteristic function, in the form
\[ f(t) = \int_{R^1} e^{itx}\,\Phi(dx) = \int e^{itx}\,d\Phi(x), \]
applies also to an arbitrary countably additive set function $\Phi(A)$ and its
corresponding (see § 8) function
\[ \Phi(x) = \Phi(-\infty;\, x). \]
Besides, Theorem 3 remains valid in the following form:

THEOREM 3 bis. If
\[ \Phi = \Phi_1 * \Phi_2, \tag{4} \]
then for all $t$
\[ f(t) = f_1(t)\,f_2(t). \tag{5} \]
Here (4) stands for
\[ \Phi(A) = \int_{R^1} \Phi_1(A - y)\,\Phi_2(dy), \]
\[ \Phi(x) = \int \Phi_1(x - y)\,d\Phi_2(y). \]

The proof of this theorem can be found in texts on Fourier transforms.
We can, however, easily deduce it from Theorem 3. To this end it suffices
to note that every countably additive set function $\Phi(A)$ is representable
in the form
\[ \Phi(A) = aP(A) + bQ(A), \]
where $P$ and $Q$ are distributions and $a$ and $b$ are constants (cf. § 8).


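§ 12. THE INVERSION FORMULA AND THE UNIQUENESS THEOREM


We shall now prove that the correspondence established in § 11 between
one-dimensional distributions and characteristic functions is one-to-one.

THEOREM 1. Let $f(t)$ and $F(x)$ be the characteristic function and
distribution function of the random variable $\xi$. If $x_1$ and $x_2$ are continuity points
of the function $F(x)$, then *
\[ F(x_2) - F(x_1) = \frac{1}{2\pi} \lim_{c \to \infty} \int_{-c}^{c} \frac{e^{-itx_1} - e^{-itx_2}}{it}\,f(t)\,dt. \tag{1} \]

* The equation (1) bears the name inversion formula.

Proof. For the sake of definiteness let $x_1 < x_2$. Set
\[ J_c = \frac{1}{2\pi} \int_{-c}^{c} \frac{e^{-itx_1} - e^{-itx_2}}{it}\,f(t)\,dt. \]
Substituting here for $f(t)$ its expression in terms of $F(x)$ and changing the
order of integration, we easily find
\[
J_c = \frac{1}{2\pi} \int \left[ \int_{-c}^{c} \frac{e^{it(z - x_1)} - e^{it(z - x_2)}}{it}\,dt \right] dF(z)
= \frac{1}{\pi} \int \left[ \int_{0}^{c} \left( \frac{\sin t(z - x_1)}{t} - \frac{\sin t(z - x_2)}{t} \right) dt \right] dF(z).
\]
Now, for every $\alpha$ and $c$
\[ \left| \frac{1}{\pi} \int_{0}^{c} \frac{\sin \alpha t}{t}\,dt \right| = \left| \frac{1}{\pi} \int_{0}^{\alpha c} \frac{\sin s}{s}\,ds \right| < 1, \tag{2} \]
and for $c \to \infty$
\[
\frac{1}{\pi} \int_{0}^{c} \frac{\sin \alpha t}{t}\,dt \to
\begin{cases}
\ \ \tfrac{1}{2} & \text{if } \alpha > 0, \\
-\tfrac{1}{2} & \text{if } \alpha < 0.
\end{cases} \tag{3}
\]
Also, this approach to the limit is uniform with respect to $\alpha$ in every
domain $\alpha > \delta > 0$ (respectively $\alpha < -\delta < 0$).

Now choose $\delta$ so small that $x_1 + \delta < x_2 - \delta$ and write $J_c$ as the sum
of five integrals
\[
J_c = \left[ \int_{-\infty}^{x_1 - \delta} + \int_{x_1 - \delta}^{x_1 + \delta} + \int_{x_1 + \delta}^{x_2 - \delta} + \int_{x_2 - \delta}^{x_2 + \delta} + \int_{x_2 + \delta}^{\infty} \right] \psi(c, z; x_1, x_2)\,dF(z),
\]
where
\[ \psi(c, z; x_1, x_2) = \frac{1}{\pi} \int_{0}^{c} \left[ \frac{\sin t(z - x_1)}{t} - \frac{\sin t(z - x_2)}{t} \right] dt. \]
From (3) it follows that as $c \to \infty$
\[ \psi(c, z; x_1, x_2) \to 0 \quad \text{for } z < x_1 - \delta \text{ and } z > x_2 + \delta \]
and
\[ \psi(c, z; x_1, x_2) \to 1 \quad \text{for } x_1 + \delta < z < x_2 - \delta, \]
both limits being uniform with respect to $z$. In the intervals $(x_1 - \delta, x_1 + \delta)$
and $(x_2 - \delta, x_2 + \delta)$ we know that
\[ |\psi(c, z; x_1, x_2)| < 2. \]
From the relations obtained above we conclude that for every $\delta > 0$
\[ \lim_{c \to \infty} J_c = F(x_2 - \delta) - F(x_1 + \delta) + R(\delta, x_1, x_2), \tag{4} \]
where
\[ |R(\delta, x_1, x_2)| \le 2\,\{F(x_1 + \delta) - F(x_1 - \delta) + F(x_2 + \delta) - F(x_2 - \delta)\}. \]

The left side of (4) does not depend on $\delta$, and the limit on the right side
as $\delta$ tends to zero is $F(x_2) - F(x_1)$, by the choice of the points $x_1$ and $x_2$.
Q.E.D.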
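The inversion formula (1) lends itself to a direct numerical check. The sketch below assumes a standard normal law, for which $f(t) = e^{-t^2/2}$, replaces the limit $c \to \infty$ by the finite value $c = 12$, and uses a midpoint rule with an even number of nodes so that the removable singularity at $t = 0$ is never evaluated. The function names are hypothetical.

```python
import cmath
import math

def std_normal_cf(t):
    # Characteristic function of the standard normal law.
    return math.exp(-0.5 * t * t)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def inversion(f, x1, x2, c=12.0, n=24000):
    # Approximates (1/2pi) * integral_{-c}^{c} (e^{-itx1} - e^{-itx2})/(it) f(t) dt.
    # n must be even: then no midpoint coincides with t = 0.
    h = 2.0 * c / n
    total = 0j
    for i in range(n):
        t = -c + (i + 0.5) * h
        total += (cmath.exp(-1j * t * x1) - cmath.exp(-1j * t * x2)) / (1j * t) * f(t)
    return (total * h / (2.0 * math.pi)).real

x1, x2 = -0.5, 1.2                   # arbitrary illustrative points
approx = inversion(std_normal_cf, x1, x2)
exact = std_normal_cdf(x2) - std_normal_cdf(x1)
print(approx, exact)
```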


THEOREM 2. A distribution function is uniquely determined by its
characteristic function.

Proof. From Theorem 1 it follows immediately that at every continuity
point $x$ of the function $F(x)$ the following formula applies:
\[ F(x) = \frac{1}{2\pi} \lim_{y \to -\infty} \lim_{c \to \infty} \int_{-c}^{c} \frac{e^{-ity} - e^{-itx}}{it}\,f(t)\,dt, \tag{5} \]
where the limit in $y$ is taken over the set of points $y$ which are continuity
points of $F(y)$.

Theorems 1 and 2 remain valid if $F(x)$ is an arbitrary left-continuous
function of bounded variation, subject to the condition $F(-\infty) = 0$. In fact,
such a function can be represented in the form
\[ F(x) = a_1 F_1(x) + a_2 F_2(x), \]
where $F_1$ and $F_2$ are distribution functions.* For the characteristic functions
we obtain in an obvious way the corresponding equation
\[ f(t) = a_1 f_1(t) + a_2 f_2(t). \]
From (1) and (5), applied to $F_1$ and $F_2$ separately, we obtain immediately
(1) and (5) for $F$.

We consider some examples of the application of the last theorem.

EXAMPLE 1. If the independent random variables $\xi_1$ and $\xi_2$ are normally
distributed, then their sum $\xi = \xi_1 + \xi_2$ is also normally distributed.

In fact, if $M\xi_1 = a_1$; $D^2\xi_1 = \sigma_1^2$; $M\xi_2 = a_2$; $D^2\xi_2 = \sigma_2^2$, then the
characteristic functions of the variables $\xi_1$ and $\xi_2$ are respectively
\[ f_1(t) = e^{ia_1 t - \frac{1}{2}\sigma_1^2 t^2}, \qquad f_2(t) = e^{ia_2 t - \frac{1}{2}\sigma_2^2 t^2}. \]
By Theorem 3 of § 11 the characteristic function $f(t)$ of the sum
$\xi = \xi_1 + \xi_2$ is
\[ f(t) = f_1(t) \cdot f_2(t) = e^{i(a_1 + a_2)t - \frac{1}{2}(\sigma_1^2 + \sigma_2^2)t^2}. \]

* See § 8.

This is the characteristic function of the normal law with the mathematical
expectation $a = a_1 + a_2$ and the variance $\sigma^2 = \sigma_1^2 + \sigma_2^2$. On the basis of the
uniqueness theorem we conclude that the distribution function of the
random variable $\xi$ is normal.

It is interesting to mention that the converse proposition is also true:
If the sum of two independent random variables is normally distributed
then each summand is normally distributed.*


EXAMPLE 2. The independent random variables $\xi_1$ and $\xi_2$ take only
non-negative integral values, and
\[ P\{\xi_1 = k\} = \frac{\lambda_1^k e^{-\lambda_1}}{k!} \quad \text{and} \quad P\{\xi_2 = k\} = \frac{\lambda_2^k e^{-\lambda_2}}{k!}. \]
In Example 2 of the preceding section we found that the characteristic
functions of the variables $\xi_1$ and $\xi_2$ are respectively
\[ f_1(t) = e^{\lambda_1(e^{it} - 1)}, \qquad f_2(t) = e^{\lambda_2(e^{it} - 1)}. \]
The characteristic function of the sum $\xi = \xi_1 + \xi_2$ is
\[ f(t) = f_1(t) \cdot f_2(t) = e^{(\lambda_1 + \lambda_2)(e^{it} - 1)}, \]
i.e., the characteristic function of a certain Poisson law. According to
the uniqueness theorem, the variable $\xi$ is distributed according to the
Poisson law with the parameter $\lambda = \lambda_1 + \lambda_2$:
\[ P\{\xi = k\} = \frac{(\lambda_1 + \lambda_2)^k e^{-(\lambda_1 + \lambda_2)}}{k!} \qquad (k \ge 0). \]

D. A. Raikov [88] proved the converse proposition: if the sum of
independent random variables is distributed according to a Poisson law then
each of them is distributed according to a Poisson law.
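The conclusion of Example 2 can also be confirmed without characteristic functions, by convolving the two Poisson laws directly. A small sketch with arbitrary parameters $\lambda_1 = 1.2$, $\lambda_2 = 2.3$; the helper names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    # P{xi = k} for Poisson's law with parameter lam.
    return lam ** k * math.exp(-lam) / math.factorial(k)

def convolve_at(k, lam1, lam2):
    # P{xi_1 + xi_2 = k} for independent Poisson summands.
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.2, 2.3                # arbitrary illustrative parameters
max_gap = max(abs(convolve_at(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
              for k in range(15))
print(max_gap)
```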





The characteristic function $f(t)$ is real if
and only if the distribution function of the random variable $\xi$ is symmetrical,
i.e., if for every $x$ the following equation holds:
\[ F(x) = 1 - F(-x + 0). \]

Let the distribution function be symmetrical. Then †
\[
\begin{aligned}
f(t) &= \int_{-\infty}^{\infty} e^{itx}\,dF(x) = \int_{-\infty}^{0} e^{itx}\,dF(x) + \int_{0}^{\infty} e^{itx}\,dF(x) \\
&= \int_{0}^{\infty} e^{-itx}\,d\bigl(1 - F(-x + 0)\bigr) + \int_{0}^{\infty} e^{itx}\,dF(x) \\
&= \int_{0}^{\infty} e^{-itx}\,dF(x) + \int_{0}^{\infty} e^{itx}\,dF(x) = 2 \int_{0}^{\infty} \cos tx\,dF(x).
\end{aligned}
\]

* H. Cramér [20].
† Translator's note. The following formula is incorrect, as is seen by taking
$F(x)$ to be 0 for $x \le 0$ and 1 for $x > 0$. The correct formula reads
\[ f(t) = F(0+) - F(0-) + 2 \int_{0+}^{\infty} \cos tx\,dF(x). \]

To prove the converse, we consider the random variable $\eta = -\xi$. The
distribution function of the variable $\eta$ is
\[ P\{\eta < x\} = P\{\xi > -x\} = 1 - F(-x + 0). \]
By Theorem 2 of § 11 the characteristic functions of the variables $\xi$ and
$\eta$ are connected by the relation
\[ f_\eta(t) = f(-t) = \overline{f(t)}. \]
Since $f(t)$ is real, so also is $f_\eta(t) = \overline{f(t)}$, and thus
\[ f_\eta(t) = f(t). \]
From the uniqueness theorem we conclude that the distribution functions
of the variables $\xi$ and $\eta$ coincide, i.e.,
\[ F(x) = 1 - F(-x + 0), \]
Q.E.D.


§ 13. CONTINUITY OF THE CORRESPONDENCE BETWEEN
DISTRIBUTION AND CHARACTERISTIC FUNCTIONS


In § 9 it was established that the totality of one-dimensional
distributions with the distance $L(F, G)$ forms a complete metric space. The
convergence $F_n \Rightarrow F$ in the sense
\[ L(F_n, F) \to 0 \]
was defined in § 9 in several equivalent ways. According to the uniqueness
theorem there is a one-to-one correspondence between distribution
functions $F$ and their characteristic functions,
\[ f(t) = \int e^{itx}\,dF(x). \]
Therefore, putting
\[ \rho(f, g) = L(F, G), \]
where $F$ and $G$ are the distribution functions corresponding to the
characteristic functions $f$ and $g$, we can at once turn the totality of characteristic
functions into a metric space. The convergence $f_n \Rightarrow f$ in the sense
\[ \rho(f_n, f) \to 0 \]
will then obviously be equivalent to $F_n \Rightarrow F$ for the corresponding
distribution functions.

However it is natural to inquire as to the meaning of such a convergence
$f_n \Rightarrow f$ from the standpoint of the properties of the functions $f_n$ and $f$
themselves. The answer is given by the following theorem, which has
fundamental importance for all that follows:

THEOREM 1. If $f_n(t)$ and $f(t)$ are characteristic functions of the distributions
$P_n(A)$ and $P(A)$ and if $P_n \Rightarrow P$, then
\[ f_n(t) \to f(t) \]
as $n \to \infty$ uniformly in every bounded interval $|t| \le T$.

THEOREM 2. If $f_n(t)$ is the characteristic function of the distribution
$P_n(A)$ and $f_n(t)$ converges as $n \to \infty$ for all $t$ to a continuous function
$f(t)$, then the distribution $P_n(A)$ converges weakly to a distribution $P(A)$
with the characteristic function $f(t)$.

From Theorems 1 and 2 it follows that the convergence $f_n \Rightarrow f$ can be
defined as uniform convergence in every finite interval. But it can also
be defined as convergence at every point $t$ without any requirement of
uniformity: within the class of characteristic functions the two definitions are
equivalent.*

Proof of Theorem 1. Let $P_n \Rightarrow P$. From the definition of weak
convergence it follows that $f_n(t) \to f(t)$ for every $t$. Since $P_n \Rightarrow P$, the $P_n$
form a conditionally compact set in the sense of weak convergence.
From Theorem 3 of § 9 and (4) of § 11 we conclude that the corresponding
characteristic functions $f_n(t)$ are equi-continuous. Moreover, the $f_n(t)$ are
uniformly bounded. Therefore the convergence $f_n(t) \to f(t)$ as $n \to \infty$ must
be uniform in every finite interval, since Arzelà's theorem is applicable to
$\{f_n(t)\}$ in any finite interval.†

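* For an arbitrary sequence of functions $f_n(t)$ we define the convergence
$f_n(t) \Rightarrow f(t)$ to be uniform convergence in every finite interval of $t$.

† Translator's note. If the functions $f_n(t)$ are equi-continuous (in every finite
interval) and if $f_n(t)$ converges for every $t$ to a continuous function, then it is almost
trivial to prove that the convergence is uniform in every finite interval. Neither the
uniform boundedness of $f_n(t)$ nor the deeper theorem of Arzelà is needed. In fact,
a direct proof of Theorem 1 is very simple (see, e.g., [21]).

Before proceeding to the proof of Theorem 2, let us introduce an
inequality.

Let $\tau > 0$, $X > 0$, and $\dfrac{1}{\tau X} < 1$. Then
\[ P(-X;\, +X) \ge \frac{\left| \dfrac{1}{2\tau} \displaystyle\int_{-\tau}^{\tau} f(t)\,dt \right| - \dfrac{1}{\tau X}}{1 - \dfrac{1}{\tau X}}. \tag{1} \]
In particular,
\[ P\!\left(-\frac{2}{\tau};\, +\frac{2}{\tau}\right) \ge 2 \left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f(t)\,dt \right| - 1. \tag{2} \]

Proof.
\[
\begin{aligned}
\left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f(t)\,dt \right|
&= \left| \int \left( \frac{1}{2\tau} \int_{-\tau}^{\tau} e^{itx}\,dt \right) dP \right|
\le \left| \int_{|x| < X} \frac{\sin \tau x}{\tau x}\,dP \right| + \left| \int_{|x| \ge X} \frac{\sin \tau x}{\tau x}\,dP \right| \\
&\le P(-X;\, +X) + \frac{1}{\tau X}\,\bigl(1 - P(-X;\, +X)\bigr)
\end{aligned} \tag{3}
\]
(in the last estimates we use the inequalities
\[ \left| \frac{\sin \tau x}{\tau x} \right| \le 1, \qquad \left| \frac{\sin \tau x}{\tau x} \right| \le \frac{1}{\tau X} \quad \text{for } |x| \ge X). \]
It is easy to see that (3) is equivalent to (1).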
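Inequality (2) can be tested on a concrete law. The sketch below assumes the standard normal distribution, for which both sides have closed forms in terms of the error function: $\frac{1}{2\tau}\int_{-\tau}^{\tau} e^{-t^2/2}\,dt = \sqrt{2\pi}\,\operatorname{erf}(\tau/\sqrt{2})/(2\tau)$ and $P(-X; +X) = \operatorname{erf}(X/\sqrt{2})$. The function names and the sample values of $\tau$ are choices made for the illustration.

```python
import math

def mean_cf_std_normal(tau):
    # (1/(2*tau)) * integral of exp(-t^2/2) over [-tau, tau], in closed form.
    return math.sqrt(2.0 * math.pi) * math.erf(tau / math.sqrt(2.0)) / (2.0 * tau)

def prob_interval_std_normal(X):
    # P(-X; +X) for the standard normal law.
    return math.erf(X / math.sqrt(2.0))

checks = []
for tau in (0.25, 0.5, 1.0, 2.0, 4.0):
    lhs = prob_interval_std_normal(2.0 / tau)        # left side of (2)
    rhs = 2.0 * abs(mean_cf_std_normal(tau)) - 1.0   # right side of (2)
    checks.append((tau, lhs, rhs))

for tau, lhs, rhs in checks:
    print(tau, lhs, rhs)
```

For large $\tau$ the right side becomes negative and the bound is trivial; the inequality carries information only when $\tau$ is small, which is exactly how it is used in the proof of Theorem 2.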
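Now let the conditions of Theorem 2 be satisfied. Then for every $\varepsilon > 0$
we can find a $\tau > 0$ such that
\[ \left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f(t)\,dt - 1 \right| < \frac{\varepsilon}{2}. \]
Consequently,
\[
\left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f_n(t)\,dt - 1 \right|
\le \left| \frac{1}{2\tau} \int_{-\tau}^{\tau} \bigl(f_n(t) - f(t)\bigr)\,dt \right| + \left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f(t)\,dt - 1 \right|
\le \frac{1}{2\tau} \int_{-\tau}^{\tau} |f_n(t) - f(t)|\,dt + \frac{\varepsilon}{2}.
\]
But $f_n(t) \to f(t)$ and $|f_n(t) - f(t)| \le 2$. Therefore by Lebesgue's theorem,
for fixed $\varepsilon > 0$ and $\tau > 0$ and for $n > n(\varepsilon, \tau)$,
\[ \left| \frac{1}{2\tau} \int_{-\tau}^{\tau} f_n(t)\,dt - 1 \right| < \varepsilon. \]
Noting that $P_n(-X;\, +X)$ does not decrease with increasing $X$, we
conclude from (2) and Theorem 3 of § 9 that the set of distributions $\{P_n\}$ is
conditionally compact.

For every convergent subsequence $\{P_{n_k}\}$ the function $f_{n_k}$ converges to
some continuous function $f^*(t)$, by Theorem 1.

Then $f^*(t)$ must coincide with $f(t)$. Consequently, the conditionally
compact set $\{P_n\}$ has a unique limit point, which means that
\[ P_n \Rightarrow P \qquad (n \to \infty). \]

As an example of the use of Theorems 1 and 2 let us prove the integral
theorem of de Moivre-Laplace.

In Example 3 of § 11 we found the characteristic function $f_n(t)$ of the
random variable $\eta = (\mu - np)/\sqrt{npq}$:
\[ f_n(t) = \left( q\,e^{-it\sqrt{\frac{p}{nq}}} + p\,e^{it\sqrt{\frac{q}{np}}} \right)^{n}. \]
Using the Taylor series expansion of $e^z$, we find that
\[ q\,e^{-it\sqrt{\frac{p}{nq}}} + p\,e^{it\sqrt{\frac{q}{np}}} = 1 - \frac{t^2}{2n}\,(1 + R_n), \]
where
\[ R_n = 2 \sum_{k=3}^{\infty} \frac{1}{k!} \left( \frac{it}{\sqrt{n}} \right)^{k-2} \frac{p\,q^k + q\,(-p)^k}{(pq)^{k/2}}. \]
As $n \to \infty$
\[ R_n \to 0, \]
uniformly in every finite interval of $t$, hence as $n \to \infty$
\[ f_n(t) = \left[ 1 - \frac{t^2}{2n}\,(1 + R_n) \right]^{n} \to e^{-\frac{t^2}{2}}. \]
By Theorem 2 it follows from this that for every $x$
\[ P\left\{ \frac{\mu - np}{\sqrt{npq}} < x \right\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{z^2}{2}}\,dz \]
as $n \to \infty$.
Obviously, this relation is equivalent to the usual formulation of the
integral theorem of de Moivre-Laplace.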
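The convergence of the characteristic functions used in this proof can be observed numerically. A minimal sketch, assuming $p = 0.3$ and $t = 1$ as arbitrary test values; the function names are hypothetical.

```python
import cmath
import math

def standardized_binomial_cf(t, n, p):
    # Characteristic function of (mu - np)/sqrt(npq), in the form of Example 3 of § 11.
    q = 1.0 - p
    base = (q * cmath.exp(-1j * t * math.sqrt(p / (n * q)))
            + p * cmath.exp(1j * t * math.sqrt(q / (n * p))))
    return base ** n

def gap(t, n, p):
    # Distance to the characteristic function exp(-t^2/2) of the standard normal law.
    return abs(standardized_binomial_cf(t, n, p) - math.exp(-0.5 * t * t))

errors = [gap(1.0, n, 0.3) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
print(errors)
```

In this run the gap shrinks roughly like $1/\sqrt{n}$, which is what the leading term of $R_n$ suggests.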
§ 14. SOME SPECIAL THEOREMS ABOUT CHARACTERISTIC FUNCTIONS


In the sequel we shall need several simple properties of the characteristic
function; we proceed to derive them now.


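THEOREM 1. If $f(t)$ is a characteristic function then for every $t$
\[ 1 - \operatorname{Re} f(2t) \le 4\,(1 - \operatorname{Re} f(t)). \]
Proof. In fact
\[ \operatorname{Re} f(t) = \int \cos tx\,dF(x) \]
and consequently
\[
\begin{aligned}
1 - \operatorname{Re} f(2t) &= \int (1 - \cos 2xt)\,dF(x) = 2 \int \sin^2 xt\,dF(x) \\
&= 2 \int (1 - \cos xt)(1 + \cos xt)\,dF(x) \le 4 \int (1 - \cos xt)\,dF(x) \\
&= 4\,(1 - \operatorname{Re} f(t)).
\end{aligned}
\]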
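The inequality of Theorem 1 is easy to probe on a grid of values of $t$. The sketch below uses the Poisson characteristic function of Example 2, § 11, with the arbitrary parameter $\lambda = 2$; the helper names are hypothetical, and a small negative tolerance is allowed for rounding.

```python
import cmath

def poisson_cf(t, lam=2.0):
    # Characteristic function exp(lam * (exp(it) - 1)) of Poisson's law.
    return cmath.exp(lam * (cmath.exp(1j * t) - 1.0))

def doubling_gap(f, t):
    # Nonnegative exactly when 1 - Re f(2t) <= 4 * (1 - Re f(t)).
    return 4.0 * (1.0 - f(t).real) - (1.0 - f(2.0 * t).real)

gaps = [doubling_gap(poisson_cf, 0.05 * k) for k in range(-200, 201)]
print(min(gaps))
```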
We note that if $f(t)$ is real the inequality just proved reduces to
\[ 1 - f(2t) \le 4\,(1 - f(t)), \]
whence for an arbitrary characteristic function $v(t)$, by Corollary 2 of
Theorem 3, § 11, we obtain
\[ 1 - |v(2t)|^2 \le 4\,(1 - |v(t)|^2). \tag{1} \]

THEOREM 2. If $f(t)$ is a characteristic function and if for some sequence
$t_1, t_2, \ldots$ converging to 0,
\[ |f(t_k)| = 1, \]
then there exists a real number $a$ such that
\[ f(t) = e^{iat}. \]
Hence $f(t)$ is the characteristic function of an improper distribution.

Proof. Suppose the contrary, that
\[ f(t) = \int e^{itx}\,dF(x), \]
where $F(x)$ is a proper distribution function. By hypothesis,
\[ f(t_k) = e^{i\lambda_k}, \]
where $\lambda_k$ is a real number; hence
\[ 0 = 1 - f(t_k)\,e^{-i\lambda_k} = \int \bigl(1 - e^{i(t_k x - \lambda_k)}\bigr)\,dF(x). \]
From this we conclude that
\[ \int \bigl(1 - \cos(t_k x - \lambda_k)\bigr)\,dF(x) = 0. \]
To satisfy this equation it is obviously necessary that the function $F(x)$
be constant in every interval of $x$ in which $\cos(t_k x - \lambda_k)$ is different from
one. In other words, $F(x)$ can increase only at points of the form
\[ x = \frac{2\pi s + \lambda_k}{t_k}, \]
where $s$ is an integer. Since $F(x)$ is a proper law, there exist at least two
points of increase $x_1$ and $x_2$ of $F(x)$. According to the above,
\[ x_1 t_k - \lambda_k = 2\pi s_1, \qquad x_2 t_k - \lambda_k = 2\pi s_2, \]
where $s_1$ and $s_2$ are distinct integers. Hence
\[ |(x_1 - x_2)\,t_k| = |2\pi(s_1 - s_2)| \ge 2\pi. \]
By the condition of the theorem we can choose $t_k$ as small as we please,
so that the inequality obtained above leads to a contradiction. Thus $F(x)$
can increase only at one point, and the theorem is proved.

Application 1. The condition of the theorem is certainly satisfied if
$|f(t)| = 1$ in some interval $0 < t \le a$ ($a > 0$).

$14] SOME SPECIAL THEOREMS ABOUT CHARACTERISTIC FUNCTIONS 57 


Application 2. If f(t.) = 1 for some sequence t; converging to 0 then 
f(t) = 1, so that 
0 for x <0, 


F(x)25e(x) = 
(9) (x) ү for x > 0. 
THEOREM 3. In order that for some sequence of constants a, we have 
Fp (x — an) => d (х) (2) 
asn— oo, 31 is necessary and sufficient that as п — oo 
[fn (t) | => 1. (3) 

(Here F,(r) denotes a distribution function and f,(é) its characteristic 
function.) 

Proof. The necessity of the condition of the theorem is almost obvious. 
In fact, by Theorem 2 of § 11 the characteristic function of the distribution 
Е„(х — an) is 

eM у, (t). 

Hence the condition 

F,(x— a4) => E (x) 
implies (by Theorem 1, § 13) 
SeA S 
and so 
| fn (t)| => 1. 

We shall now prove the converse proposition. From (3) it follows 

that as n — oo 


I fn (4) P => 1. 


In other words, the characteristic function of the difference of two inde- 
pendent random variables £ and nn, both distributed according to the 
law Г, (х), converges to the characteristic function of the law e(x) as n — co. 
This means that for every e > 0 


Píli—nlzmt:)-0. (4) 
Pick a number o, so that 
Р «а, > 5 >P lin < ani. 
Then for every e > 0 
Р {5 — т> | = Р { (5 — an) — (t4 — 94) E} 
2. Р (Е, — а, 226, т а 0 = Р {7 titel P( is < } 
SZ Fa (а Б). (5) 


In exactly the same way, 


P(t,— е | Р, (ва е). 
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Hence by (4) and (5), for every e > 0 as n — œ 


1 — F4 (tnt E) + Fan —8) <2P{|&,—2,] > е } > 0. 
This relation is obviously equivalent to (2). 


THEOREM 4. If for some sequence of integers $n_1 < n_2 < \dots < n_k < \dots$

\[ [f_k(t)]^{n_k} \Rightarrow f(t) \tag{6} \]

as $k \to \infty$, where f(t) is some continuous function and the $f_k(t)$ $(k = 1, 2, \dots)$
are characteristic functions, then

\[ f_k(t) \Rightarrow 1 \tag{7} \]

as $k \to \infty$, so that

\[ F_k(x) \Rightarrow \varepsilon(x). \]

Proof. Since f(0) = 1 and f(t) is continuous, there exists a > 0 such that

\[ |f(t)| > 0 \quad \text{for } |t| \le a. \]

From (6) it is clear that in this interval

\[ f_k(t) \to 1. \]

From this and the inequality in Theorem 1 of this section,

\[ 1 - \operatorname{Re} f_k(2t) \le 4\,(1 - \operatorname{Re} f_k(t)), \]

we conclude that

\[ \operatorname{Re} f_k(2t) \to 1 \quad \text{for } |t| \le a, \]

and therefore

\[ f_k(t) \to 1 \quad \text{for } |t| \le 2a. \]

The possibility of doubling any interval in which $f_k(t) \to 1$ clearly yields
the conclusion of the theorem.
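
The doubling step rests on the inequality $1 - \operatorname{Re} f(2t) \le 4(1 - \operatorname{Re} f(t))$ quoted from Theorem 1 of this section. As a purely numerical illustration (not part of the original text), the following Python sketch evaluates the gap between the two sides on a grid for two characteristic functions chosen here as assumptions: a Poisson law with parameter 1.5 and the uniform law on [-1, 1].

```python
import cmath
import math

def poisson_cf(t, lam=1.5):
    # Characteristic function exp(lam * (e^{it} - 1)) of a Poisson law.
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

def uniform_cf(t):
    # Characteristic function sin(t)/t of the uniform law on [-1, 1].
    return complex(1.0) if t == 0 else complex(math.sin(t) / t)

def doubling_gap(cf, t):
    """Return 4*(1 - Re f(t)) - (1 - Re f(2t)); the inequality says this is >= 0."""
    return 4 * (1 - cf(t).real) - (1 - cf(2 * t).real)

grid = [k * 0.01 for k in range(-1000, 1001)]
worst = min(doubling_gap(cf, t) for cf in (poisson_cf, uniform_cf) for t in grid)
```

Up to rounding error, `worst` should not be negative; a grid check of this kind is only a sanity test, not a proof.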

We shall call a discrete distribution of a random variable $\xi$ a lattice
distribution if there exist numbers a and h > 0 such that every possible
value of $\xi$ can be represented in the form a + kh, where k runs through
integral values (not necessarily all). We shall call the number h a span of the
distribution.

Many important distributions in the theory of probability belong to the 
class of lattice distributions (for example, the Bernoulli distribution, the 
Poisson distribution). 

If it is impossible to represent all the possible values of $\xi$ in the form
$b + kh_1$, for some b and some $h_1 > h$, then we shall say that h is a maximum
span of the distribution.

The conditions for a maximum span of a distribution can be expressed
in other terms. Namely, a span h will be maximum if and only if one is
the greatest common divisor of the pairwise differences of the possible
values of $\xi$, divided by h. A little later we shall give a third condition for a
span to be maximum. Now we proceed to establish the following
characteristic property of lattice distributions.


THEOREM 5. In order that the random variable $\xi$ have a lattice distribution,
it is necessary and sufficient that for some nonzero value of the argument
the modulus of the characteristic function of $\xi$ be equal to one.

Proof. The necessity of the condition is proved at once by calculation.
Let $\xi$ have a lattice distribution and

\[ p_k = P\{\xi = a + kh\}. \]

Then the characteristic function of $\xi$ is

\[ f(t) = e^{iat} \sum_{k} e^{itkh} p_k. \tag{8} \]

The second factor is a periodic function with the period $2\pi/h$. Since f(0) = 1,
it is evident that

\[ \left| f\!\left( \frac{2\pi}{h} \right) \right| = 1. \]

Now suppose that for $t_0 \ne 0$

\[ |f(t_0)| = 1. \]

In other words, we suppose that for some real a

\[ f(t_0) = e^{i t_0 a}. \]

This equation can be written out as

\[ \int e^{i t_0 (x - a)}\, dF(x) = 1, \]

whence it follows that

\[ \int \cos t_0 (x - a)\, dF(x) = 1. \]

For this equation to hold, it is necessary that the function F(x) be constant
everywhere with the exception of those x for which

\[ \cos t_0 (x - a) = 1. \]

All x satisfying the last equation have the form

\[ x = a + k\, \frac{2\pi}{t_0}, \tag{9} \]

where k is an integer. Q.E.D.
From the theorem just proved we easily deduce 


COROLLARY 1. If the characteristic function f(t) is such that for two
incommensurable values of the argument $t_0$ and $t_1$ the equations

\[ |f(t_0)| = 1 \quad \text{and} \quad |f(t_1)| = 1 \]

both hold, then

\[ |f(t)| \equiv 1. \]

In particular, if the modulus of the characteristic function f(t) is equal to
one in any interval $a \le t \le b$, $a < b$, then it is equal to one for all values of t.


Theorem 5 enables us to formulate the following important result. 


COROLLARY 2. A span h of the distribution is maximum if and only if
the modulus of the characteristic function of $\xi$ is less than one in the interval
$0 < |t| < 2\pi/h$ and equal to one for $t = 2\pi/h$.

Proof. If

\[ |f(t_1)| = 1 \]

for some $t_1$, $0 < |t_1| < 2\pi/h$, then according to (9) the number $2\pi/|t_1|$ would
be a span of the distribution. But since by hypothesis

\[ \frac{2\pi}{|t_1|} > h, \]

h cannot be a maximum span.

From this we conclude that for every lattice distribution and for every
$\varepsilon > 0$, we can find a c > 0 [$c = c(\varepsilon)$] such that if h is a maximum span
then for $\varepsilon \le |t| \le \frac{2\pi}{h} - \varepsilon$

\[ |f(t)| \le e^{-c}. \]
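
Corollary 2 can be illustrated numerically. The sketch below is an illustration only; the lattice law (values 0.5 + 2k for k = 0, 1, 3, with the probabilities shown) is a hypothetical example chosen so that h = 2 is a maximum span, since the differences of the k values have greatest common divisor one. It evaluates the modulus of the characteristic function (8) at $t = 2\pi/h$ and on a grid inside $(0, 2\pi/h)$.

```python
import cmath
import math

def lattice_cf(t, a, h, probs):
    """f(t) = e^{iat} * sum_k p_k e^{itkh} for P{xi = a + k h} = probs[k]."""
    return cmath.exp(1j * a * t) * sum(
        p * cmath.exp(1j * t * k * h) for k, p in probs.items()
    )

# Hypothetical lattice law: values 0.5 + 2k for k = 0, 1, 3.
a, h = 0.5, 2.0
probs = {0: 0.2, 1: 0.5, 3: 0.3}

period = 2 * math.pi / h
modulus_at_period = abs(lattice_cf(period, a, h, probs))

# Scan the open interval (0, 2*pi/h), staying away from the endpoints.
inner = [period * j / 1000 for j in range(50, 951)]
max_inner_modulus = max(abs(lattice_cf(t, a, h, probs)) for t in inner)
```

Here `modulus_at_period` should equal one up to rounding, while `max_inner_modulus` stays strictly below one, in line with the bound $|f(t)| \le e^{-c}$ away from the endpoints.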


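For later purposes we must deduce two elementary formulas. Multiply
both sides of the equation (8) by $e^{-iat - itrh}$ (r an integer) and integrate from
$-\pi/h$ to $\pi/h$. Since

\[ \int_{-\pi/h}^{\pi/h} e^{it(k - r)h}\, dt = \begin{cases} 0 & \text{for } k \ne r, \\ \dfrac{2\pi}{h} & \text{for } k = r, \end{cases} \]

we obtain as a result

\[ p_r = \frac{h}{2\pi} \int_{-\pi/h}^{\pi/h} f(t)\, e^{-iat - itrh}\, dt. \tag{10} \]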
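
Formula (10) lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: a Poisson law with a = 0, h = 1 and parameter 2 is chosen as the test case, and the integral is approximated by the midpoint rule with 2000 nodes. The helper names are hypothetical.

```python
import cmath
import math

def poisson_cf(t, lam):
    # Characteristic function of a Poisson law on 0, 1, 2, ...
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

def lattice_prob(cf, r, a=0.0, h=1.0, steps=2000):
    """Approximate p_r = (h / 2 pi) * integral_{-pi/h}^{pi/h} f(t) e^{-iat - itrh} dt
    by the midpoint rule."""
    lo, hi = -math.pi / h, math.pi / h
    dt = (hi - lo) / steps
    total = 0j
    for j in range(steps):
        t = lo + (j + 0.5) * dt
        total += cf(t) * cmath.exp(-1j * a * t - 1j * t * r * h)
    return (h / (2 * math.pi) * total * dt).real

lam = 2.0
recovered = [lattice_prob(lambda t: poisson_cf(t, lam), r) for r in range(6)]
exact = [math.exp(-lam) * lam**r / math.factorial(r) for r in range(6)]
```

The lists `recovered` and `exact` should agree closely, because the integrand is smooth and periodic over the interval of integration.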


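This equation enables us to write the inversion formula for lattice
distributions in a somewhat different form from the one we had before.
Indeed, putting

\[ x_1 = a + mh - \tfrac{1}{2} h, \qquad x_2 = a + nh + \tfrac{1}{2} h \qquad (n \ge m), \]

we shall prove that

\[ F(x_2) - F(x_1) = \frac{h}{2\pi} \int_{-\pi/h}^{\pi/h} f(t)\, \frac{e^{-itx_1} - e^{-itx_2}}{2i \sin \frac{th}{2}}\, dt. \tag{11} \]

In fact,

\[ F(x_2) - F(x_1) = \sum_{r=m}^{n} p_r = \frac{h}{2\pi} \sum_{r=m}^{n} \int_{-\pi/h}^{\pi/h} f(t)\, e^{-ita - itrh}\, dt, \]

and

\[ \sum_{r=m}^{n} e^{-ita - itrh} = e^{-ita}\, \frac{e^{-itmh} - e^{-it(n+1)h}}{1 - e^{-ith}} = e^{-ita}\, \frac{e^{-it\left(m - \frac{1}{2}\right)h} - e^{-it\left(n + \frac{1}{2}\right)h}}{2i \sin \frac{th}{2}} = \frac{e^{-itx_1} - e^{-itx_2}}{2i \sin \frac{th}{2}}. \]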

From Corollary 2 follows: 
COROLLARY 3. Every lattice distribution, apart from the improper ones,
has a unique maximum span.


Of course, Corollary 3 can also be easily obtained in an elementary 
arithmetical way. 


§ 15. MOMENTS AND SEMI-INVARIANTS


By virtue of the uniqueness theorem of § 12 the values of the
characteristic function

\[ f(t) = \int e^{itx}\, dF(x) \]

for all t determine the distribution function F(x). It is therefore natural
to expect that all other numerical characteristics of a one-dimensional
distribution (or distribution function) can be expressed in terms of its
characteristic function. For example, we shall soon see that

\[ M\xi = \frac{1}{i}\, f'(0), \tag{1} \]
\[ D\xi = -f''(0) + [f'(0)]^2. \tag{2} \]


At the beginning of this section we shall consider from this point of view
the most useful numerical characteristics of one-dimensional distributions:
moments and semi-invariants.

The number

\[ \alpha_s = M\xi^s = \int x^s\, dF_\xi(x) \tag{3} \]

is called the moment of order s. In accordance with § 4, for the existence
of the moment $\alpha_s$ the existence of the absolute moment of order s,

\[ \beta_s = M|\xi|^s = \int |x|^s\, dF_\xi(x), \tag{4} \]

is necessary and sufficient. The following lemma is well known from
elementary courses in the theory of probability.


LEMMA 1. If the moment $\beta_s$ exists for the random variable $\xi$, then all
moments $\beta_k$ for k < s exist and

\[ \beta_1 \le \beta_2^{1/2} \le \dots \le \beta_s^{1/s}. \tag{5} \]

As for the moments $\alpha_0$ and $\beta_0$ of order zero, they are always considered
to exist and to be equal:

\[ \alpha_0 = \beta_0 = 1. \tag{6} \]

By definition,

\[ \alpha_1 = M\xi. \]

For s > 1 it is natural to consider, besides the moments $\alpha_s$ and $\beta_s$, the
central moments

\[ \mu_s = M(\xi - M\xi)^s = \int (x - \alpha_1)^s\, dF_\xi(x) \tag{7} \]

and the central absolute moments

\[ \nu_s = M|\xi - M\xi|^s = \int |x - \alpha_1|^s\, dF_\xi(x). \tag{8} \]

From the inequalities

\[ |\xi|^s \le 2^s \{ |\xi - \alpha_1|^s + |\alpha_1|^s \}, \qquad |\xi - \alpha_1|^s \le 2^s \{ |\xi|^s + |\alpha_1|^s \} \]

it is easy to deduce that the existence of the moments $\mu_s$ and $\nu_s$ is equivalent
to the existence of $\alpha_s$ and $\beta_s$. Hence in the following we shall speak of the
condition of existence of the moment of order s, without specifying which one.

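It is easily computed that the moments $\alpha_s$ and $\mu_s$ are connected by the
relations

\[ \begin{aligned}
\mu_0 &= 1, \\
\mu_1 &= 0, \\
\mu_2 &= \alpha_2 - \alpha_1^2, \\
\mu_3 &= \alpha_3 - 3\alpha_2\alpha_1 + 2\alpha_1^3, \\
\mu_4 &= \alpha_4 - 4\alpha_3\alpha_1 + 6\alpha_2\alpha_1^2 - 3\alpha_1^4.
\end{aligned} \tag{9} \]

These relations can be prolonged to any s and yield the inversion

\[ \begin{aligned}
\alpha_0 &= 1, \\
\alpha_1 &= \alpha_1, \\
\alpha_2 &= \mu_2 + \alpha_1^2, \\
\alpha_3 &= \mu_3 + 3\alpha_1\mu_2 + \alpha_1^3, \\
\alpha_4 &= \mu_4 + 4\alpha_1\mu_3 + 6\alpha_1^2\mu_2 + \alpha_1^4.
\end{aligned} \tag{10} \]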


The connection between the characteristic function and the moments
is given by:

LEMMA 2. If the random variable $\xi$ has a moment of order k, then its
characteristic function f(t) has continuous derivatives up to and including
the kth order. Moreover

\[ \alpha_s = \frac{1}{i^s} \left[ \frac{d^s}{dt^s} f(t) \right]_{t=0} \qquad (s = 1, 2, \dots, k). \tag{11} \]

Proof. By definition,

\[ f(t) = \int e^{itx}\, dF_\xi(x). \]

Differentiate this equation formally k times:

\[ f^{(k)}(t) = i^k \int x^k e^{itx}\, dF_\xi(x). \tag{12} \]

Since by assumption

\[ \int |x|^k\, dF_\xi(x) < \infty, \]

the integral on the right side of (12) exists and (12) can be proved by
means of repeated integration with respect to t (inverting the order of
integrations with respect to t and x in accordance with Fubini's theorem).
This proves the legitimacy of the differentiation. Putting t = 0 in (12),
we arrive at (11).
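
Formula (11) can be tried out numerically by replacing the derivatives at zero with central differences. The sketch below is an illustration only: the Poisson law with parameter 3 and the step size are assumptions made here, and finite differences merely approximate the exact derivatives of the lemma.

```python
import cmath

def poisson_cf(t, lam=3.0):
    # Characteristic function of a Poisson law with parameter lam.
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

def moments_from_cf(cf, step=1e-3):
    """alpha_1 and alpha_2 via alpha_s = f^(s)(0) / i^s, with central differences
    standing in for the exact derivatives."""
    d1 = (cf(step) - cf(-step)) / (2 * step)
    d2 = (cf(step) - 2 * cf(0.0) + cf(-step)) / step**2
    return (d1 / 1j).real, (d2 / 1j**2).real

alpha1, alpha2 = moments_from_cf(poisson_cf)
```

For this law the exact values are $\alpha_1 = 3$ and $\alpha_2 = 3 + 3^2 = 12$, and the computed values should match them to within the discretization error.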
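Letting

\[ \eta = \xi - \alpha_1, \]

we obtain for the central moments $\mu_s$ of the variable $\xi$ the expression

\[ \mu_s = M\eta^s = \frac{1}{i^s} \left[ \frac{d^s}{dt^s} f_\eta(t) \right]_{t=0}. \]

Since

\[ f_\eta(t) = e^{-i\alpha_1 t} f_\xi(t), \]

we have

\[ \mu_s = \frac{1}{i^s} \left[ \frac{d^s}{dt^s} \bigl( e^{-i\alpha_1 t} f_\xi(t) \bigr) \right]_{t=0}. \tag{13} \]

Comparison of (11) and (13) permits a new derivation of the relations
(9) and (10).

The moments of the sum of two independent random variables

\[ \xi = \xi_1 + \xi_2 \]

can be calculated (the superscript (j) marks the moments of $\xi_j$):

\[ \begin{aligned}
\alpha_1 &= \alpha_1^{(1)} + \alpha_1^{(2)}, \\
\alpha_2 &= \alpha_2^{(1)} + 2\alpha_1^{(1)}\alpha_1^{(2)} + \alpha_2^{(2)}, \\
\alpha_3 &= \alpha_3^{(1)} + 3\alpha_2^{(1)}\alpha_1^{(2)} + 3\alpha_1^{(1)}\alpha_2^{(2)} + \alpha_3^{(2)},
\end{aligned} \tag{14} \]

\[ \begin{aligned}
\mu_2 &= \mu_2^{(1)} + \mu_2^{(2)}, \\
\mu_3 &= \mu_3^{(1)} + \mu_3^{(2)}, \\
\mu_4 &= \mu_4^{(1)} + 6\mu_2^{(1)}\mu_2^{(2)} + \mu_4^{(2)}.
\end{aligned} \tag{15} \]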


From these formulas we see that

\[ \alpha_1 = M\xi, \qquad \mu_2 = D\xi \qquad \text{and} \qquad \mu_3 \]

are additive for independent summands. In § 11 the question was raised
whether there exist other numerical characteristics of distributions which
possess this property. The answer is again essentially contained in the
fundamental property of the characteristic function

\[ f_\xi(t) = f_{\xi_1}(t)\, f_{\xi_2}(t), \]

from which it follows that

\[ \log f_\xi(t) = \log f_{\xi_1}(t) + \log f_{\xi_2}(t). \tag{16} \]

Here, as everywhere in the sequel, log f(t) denotes the principal branch
of the logarithm of the characteristic function, i.e., the function which is
defined only for those real t for which f(t) is different from 0 both at the
point t and between t and 0, and which is continuous and reduces at
t = 0 to

\[ \log 1 = 0. \]

As is easily verified, the principal branch of the logarithm of f(t) is
determined uniquely by these conditions. In this section we need only values
of log f(t) in the neighborhood of zero.

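If $\xi$ has a moment of the sth order, by Lemma 2 the first s derivatives
of both $f_\xi(t)$ and $\log f_\xi(t)$ exist at t = 0. Letting

\[ \kappa_r = \frac{1}{i^r} \left[ \frac{d^r}{dt^r} \log f_\xi(t) \right]_{t=0}, \]

we can therefore write

\[ \log f_\xi(t) = \sum_{r=1}^{s} \frac{\kappa_r}{r!} (it)^r + o(t^s). \tag{17} \]

From (16) and (17) it follows that for the sum of independent variables
the corresponding coefficients $\kappa_r$ add up:

\[ \kappa_r = \kappa_r^{(1)} + \kappa_r^{(2)}. \tag{18} \]

The coefficients $\kappa_r$ in (17) are called the semi-invariants of the random
variable $\xi$ (or of its distribution). The semi-invariants up to and including
the order s exist, if the moment $\beta_s$ exists.

The semi-invariants up to the sth order are uniquely determined by
the moments up to the same order. Putting

\[ w = it \]

and noting that by (11),

\[ f_\xi(t) = \sum_{r=0}^{s} \frac{\alpha_r}{r!} (it)^r + o(t^s) \tag{19} \]

provided the moment $\alpha_s$ exists, we can write the relations between the
semi-invariants and the moments in the form of an equation between formal
power series:

\[ \sum_{r=1}^{\infty} \frac{\kappa_r}{r!} w^r = \log \sum_{r=0}^{\infty} \frac{\alpha_r}{r!} w^r. \tag{20} \]

This gives

\[ \begin{aligned}
\kappa_1 &= \alpha_1 = M\xi, \\
\kappa_2 &= \alpha_2 - \alpha_1^2 = D\xi, \\
\kappa_3 &= \alpha_3 - 3\alpha_1\alpha_2 + 2\alpha_1^3, \\
\kappa_4 &= \alpha_4 - 3\alpha_2^2 - 4\alpha_1\alpha_3 + 12\alpha_1^2\alpha_2 - 6\alpha_1^4,
\end{aligned} \tag{21} \]

and

\[ \begin{aligned}
\alpha_1 &= \kappa_1, \\
\alpha_2 &= \kappa_2 + \kappa_1^2, \\
\alpha_3 &= \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3, \\
\alpha_4 &= \kappa_4 + 3\kappa_2^2 + 4\kappa_1\kappa_3 + 6\kappa_1^2\kappa_2 + \kappa_1^4.
\end{aligned} \tag{22} \]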
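
The relations (18), (20) and (21) can be checked with exact rational arithmetic. The sketch below is an illustration rather than the book's method: it uses the recursion $\alpha_r = \sum_{j=1}^{r} \binom{r-1}{j-1} \kappa_j \alpha_{r-j}$, which follows from differentiating the formal identity (20), and two small discrete laws that are hypothetical test data.

```python
from fractions import Fraction
from math import comb

def raw_moments(dist, order):
    """alpha_0..alpha_order of a finite discrete law {value: probability}."""
    return [sum(p * Fraction(x) ** r for x, p in dist.items()) for r in range(order + 1)]

def semi_invariants(alpha):
    """kappa_1..kappa_s from alpha_0..alpha_s via the recursion implied by (20):
    alpha_r = sum_{j=1}^{r} C(r-1, j-1) * kappa_j * alpha_{r-j}."""
    kappa = [Fraction(0)] * len(alpha)
    for r in range(1, len(alpha)):
        kappa[r] = alpha[r] - sum(
            comb(r - 1, j - 1) * kappa[j] * alpha[r - j] for j in range(1, r)
        )
    return kappa[1:]

def convolve(d1, d2):
    """Law of the sum of two independent discrete variables."""
    out = {}
    for x, p in d1.items():
        for y, q in d2.items():
            out[x + y] = out.get(x + y, Fraction(0)) + p * q
    return out

# Hypothetical test laws.
d1 = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
d2 = {-1: Fraction(1, 3), 2: Fraction(2, 3)}

k1 = semi_invariants(raw_moments(d1, 4))
k2 = semi_invariants(raw_moments(d2, 4))
k_sum = semi_invariants(raw_moments(convolve(d1, d2), 4))

a = raw_moments(d1, 4)
kappa4_closed_form = (
    a[4] - 3 * a[2] ** 2 - 4 * a[1] * a[3] + 12 * a[1] ** 2 * a[2] - 6 * a[1] ** 4
)
```

With exact fractions, `k_sum` should coincide term by term with the sums of `k1` and `k2`, as (18) asserts, and `k1[3]` should equal the closed form for $\kappa_4$ in (21).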
From (17) and (19) we deduce that, for any $r \le s$, $\kappa_r/r!$ is the coefficient
of $z^r$ in the expansion of $\log\Bigl( 1 + \sum_{k=1}^{s} \frac{\alpha_k}{k!} z^k \Bigr)$ as a power series in z. By (5)


this is majorized by the series 


тзн 





со ( 3j oo b 
—kg|1- У am lp ‘ay. 
k=1 k=1 
Consequently,

\[ |\kappa_r| \le r^r \beta_r. \tag{23} \]
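For later purposes it is expedient to make use of the formula

\[ e^{-i\alpha_1 t} f_\xi(t) = 1 + \sum_{r=2}^{s} \frac{\mu_r}{r!} (it)^r + o(t^s), \tag{24} \]

and to write down the connection between the semi-invariants and the
central moments as a formal equation:

\[ \sum_{r=1}^{\infty} \frac{\kappa_r}{r!} w^r = \alpha_1 w + \log \Bigl[ 1 + \sum_{r=2}^{\infty} \frac{\mu_r}{r!} w^r \Bigr]. \tag{25} \]

This gives

\[ \begin{aligned}
\kappa_1 &= \alpha_1 = M\xi, \\
\kappa_2 &= \mu_2 = D\xi, \\
\kappa_3 &= \mu_3, \\
\kappa_4 &= \mu_4 - 3\mu_2^2, \\
\kappa_5 &= \mu_5 - 10\mu_2\mu_3, \\
\kappa_6 &= \mu_6 - 15\mu_2\mu_4 - 10\mu_3^2 + 30\mu_2^3.
\end{aligned} \tag{26} \]

If we count the semi-invariants as the basic characteristics of probability
distributions, then the simplest distribution with given

\[ \kappa_1 = M\xi, \qquad \kappa_2 = D\xi \]

should be the distribution with the characteristic function

\[ f(t) = e^{i\kappa_1 t - \frac{\kappa_2}{2} t^2}, \]

i.e., the normal distribution with the probability density

\[ p(x) = \frac{1}{\sqrt{2\pi\kappa_2}}\, e^{-\frac{(x - \kappa_1)^2}{2\kappa_2}}. \]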


CHAPTER 3
INFINITELY DIVISIBLE DISTRIBUTIONS 


§ 16. STATEMENT OF THE PROBLEM. 
RANDOM FUNCTIONS WITH INDEPENDENT INCREMENTS 


Classical limit theorems, the generalization and strengthening of which
constitute the principal content of this book, have to do with sums

\[ \zeta_n = \xi_1 + \xi_2 + \dots + \xi_n \]

of an increasing number of terms of a sequence

\[ \xi_1, \xi_2, \dots, \xi_n, \dots \]

of independent random variables.

It is possible to imagine an analogous scheme, in which the index n,
taking only integral values, is replaced by a continuously varying
parameter $\lambda$. There are no longer any elementary summands $\xi_n$, but it is
possible to carry over the requirement, which in the discrete case follows
from the independence of the variables $\xi_n$, that

\[ \zeta_{n_2} - \zeta_{n_1},\ \zeta_{n_3} - \zeta_{n_2},\ \dots,\ \zeta_{n_k} - \zeta_{n_{k-1}} \]

are independent if $n_1 < n_2 < \dots < n_k$.

We confine ourselves to the continuous analogue of the case in which
all elementary summands $\xi_n$ have the same law of distribution. Then the
laws of distribution of the increments $\zeta_m - \zeta_n$ depend only on the number
of summands entering into them, i.e., on the value of the difference m - n.

Our continuous scheme will look like this: to each real $\lambda \ge 0$ corresponds
a random variable $\zeta_\lambda$ such that

(1) $\zeta_0$ is identically zero;

(2) the law of distribution of the difference $\zeta_{\lambda_2} - \zeta_{\lambda_1}$, with $\lambda_2 > \lambda_1$,
depends only on the difference $\lambda_2 - \lambda_1$;

(3) for $\lambda_1 < \lambda_2 < \dots < \lambda_k$ the differences

\[ \zeta_{\lambda_2} - \zeta_{\lambda_1},\ \zeta_{\lambda_3} - \zeta_{\lambda_2},\ \dots,\ \zeta_{\lambda_k} - \zeta_{\lambda_{k-1}} \]

are mutually independent.

The problem of this chapter consists in the study of those distributions
of the variable $\zeta_1$ which are consistent with the scheme presented here.
From the conditions (1), (2), and (3) it follows that $\zeta_1$, for any natural
number n, is the sum

\[ \zeta_1 = \eta_{n1} + \eta_{n2} + \dots + \eta_{nn} \]

of n identically distributed independent summands

\[ \eta_{nk} = \zeta_{k/n} - \zeta_{(k-1)/n}. \]

This circumstance forms the basis of the formal definition of "infinitely
divisible distribution" in § 17.

In Chapter 4 it will appear that infinitely divisible distributions play
a fundamental role even in the classical problems of limit theorems for
discrete sums of independent random variables. In the later sections of
this chapter the properties of infinitely divisible distributions will be
studied, by preference, purely analytically, by means of the characteristic
function. However, almost all the results of this chapter were first
discovered heuristically starting from the preceding scheme of the random
function $\zeta_\lambda$ with independent increments, depending on the continuous
argument $\lambda$.
Before the construction of the general theory two basic elementary
types of such random functions were known:

(1) The normal type, in which the characteristic function $f_\lambda(t)$ of the
random variable $\zeta_\lambda$ is given by the formula

\[ \log f_\lambda(t) = \lambda \Bigl( i\gamma t - \frac{\sigma^2 t^2}{2} \Bigr). \tag{1} \]

(2) The Poisson type, in which the characteristic function $f_\lambda(t)$ has
the form

\[ \log f_\lambda(t) = \lambda c\, (e^{iht} - 1). \tag{2} \]

It is possible to show that the normal type is the only one that can arise
when $\zeta_\lambda$, as a function of $\lambda$, is continuous with probability one (see § 26)
(a somewhat weaker proposition is proved in § 2 of Glivenko's book [32]).
The Poisson type arises when $\zeta_\lambda$ as a function of $\lambda$ is, with probability one, a
nondecreasing step function taking only values which are multiples of the
"span" h (this case of discrete jump-like random process is developed in
§ 2 of Glivenko's book [32]).
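
The Poisson type (2) can be illustrated numerically. The sketch below rests on assumptions made here: $\zeta_\lambda$ is taken to be h times a Poisson variable with mean $\lambda c$, and the parameter values are arbitrary. It compares the closed form (2) with a term-by-term summation of the series for the characteristic function, and checks the multiplicative property $f_{\lambda_1 + \lambda_2} = f_{\lambda_1} f_{\lambda_2}$ that conditions (2) and (3) of the scheme impose.

```python
import cmath
import math

def poisson_type_log_cf(t, lam, c, h):
    """Right side of (2): lam * c * (e^{iht} - 1)."""
    return lam * c * (cmath.exp(1j * h * t) - 1)

def cf_from_series(t, lam, c, h, terms=80):
    """E exp(i t h N) for N ~ Poisson(lam * c), summed term by term."""
    rate = lam * c
    total = 0j
    weight = math.exp(-rate)          # P{N = 0}
    for k in range(terms):
        total += weight * cmath.exp(1j * t * h * k)
        weight *= rate / (k + 1)      # P{N = k + 1} from P{N = k}
    return total

t, c, h = 0.7, 1.3, 2.0
lam1, lam2 = 0.4, 1.1
direct = cf_from_series(t, lam1 + lam2, c, h)
closed = cmath.exp(poisson_type_log_cf(t, lam1 + lam2, c, h))
product = cmath.exp(poisson_type_log_cf(t, lam1, c, h)) * cmath.exp(
    poisson_type_log_cf(t, lam2, c, h)
)
```

All three quantities `direct`, `closed` and `product` should agree up to rounding error.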

It is natural to try to build up a function $\zeta_\lambda$ combining these two types
of variation and admitting not only jumps of a fixed magnitude h, but of
all sorts of magnitudes. Let us, then, suppose that in the interval $(\lambda, \lambda + d\lambda)$
a jump occurs with probability $c\, d\lambda$, and that the distribution function of
the magnitude of the jump is

\[ P(h < u) = F(u). \]

Then by combining (1) and (2) we arrive at the formula proposed first
by de Finetti (see [30]):

\[ \log f_\lambda(t) = \lambda \Bigl\{ i\gamma t - \frac{\sigma^2 t^2}{2} + c \int (e^{iut} - 1)\, dF(u) \Bigr\}. \tag{3} \]

The formula (3), however, by no means yet gives the general solution
of the question. In case $\zeta_\lambda$ has a finite variance the general solution was
found by A. N. Kolmogorov (see later § 18). To this end two difficulties
had to be overcome. First of all it is necessary to take into account the
fact that jumps of small magnitudes can occur very often, and the full
"density of jumps" may be infinite. Since jumps with large absolute
values * cannot occur with infinite "density," it turned out to be possible to
introduce two functions M(u) and N(u) such that in the interval $(\lambda, \lambda + d\lambda)$
the jumps h,

\[ h < u < 0, \]

occur with probability $M(u)\, d\lambda$, and the jumps h,

\[ h > u > 0, \]

with probability $-N(u)\, d\lambda$. For u = 0 both these functions may become
infinite. (3) now becomes

\[ \log f_\lambda(t) = \lambda \Bigl\{ i\gamma t - \frac{\sigma^2 t^2}{2} + \int_{-\infty}^{0} (e^{iut} - 1)\, dM(u) + \int_{0}^{\infty} (e^{iut} - 1)\, dN(u) \Bigr\}, \tag{4} \]

where the integrals are understood to be the limits †

\[ \int_{-\infty}^{0} (e^{iut} - 1)\, dM(u) = \lim_{a \to 0} \int_{-\infty}^{a} (e^{iut} - 1)\, dM(u), \qquad a < 0, \]

\[ \int_{0}^{\infty} (e^{iut} - 1)\, dN(u) = \lim_{a \to 0} \int_{a}^{\infty} (e^{iut} - 1)\, dN(u), \qquad a > 0. \]

* Translator's note. That is, with absolute values bounded away from zero.

† In connection with the integral

\[ \int_{a}^{\infty} (e^{iut} - 1)\, dN(u), \qquad a > 0, \]

it should be remarked that it is understood to be

\[ \int_{u > a} (e^{iut} - 1)\, \mu(du), \]

where the measure $\mu(A)$ is determined by the condition

\[ \mu(u; \infty) = -N(u). \]

The second difficulty consists in that it is possible to have a function
$\zeta_\lambda$ for which the integrals

\[ \int_{-a}^{0} u\, dM(u) \quad \text{and} \quad \int_{0}^{a} u\, dN(u), \qquad a > 0, \]

and consequently also the integrals in (4), are divergent. This means that
the mathematical expectation of the sum of the jumps of small magnitudes
|h| < a can be infinite. Roughly speaking, such an infinity can be
compensated by introducing in the expression for $\log f_\lambda(t)$ a term $\lambda i\gamma t$ with an
infinite value $\gamma$. This compensation is made rigorous by introducing a term
proportional to iut inside those integrals whose divergence must be
compensated. This leads to the formula


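\[ \log f_\lambda(t) = \lambda \Bigl\{ i\gamma t - \frac{\sigma^2 t^2}{2} + \int_{-\infty}^{0} (e^{iut} - 1 - iut)\, dM(u) + \int_{0}^{\infty} (e^{iut} - 1 - iut)\, dN(u) \Bigr\}, \tag{5} \]

which already represents the general form of the logarithm of the
characteristic function for variables $\zeta_\lambda$ of finite variance.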

To cover the case of infinite variance, the correction term under the
integral sign must be introduced with greater care. This was done by
P. Lévy, who gave for $\log f_\lambda(t)$ the formula which is valid in the general
case:

\[ \log f_\lambda(t) = \lambda \Bigl\{ i\gamma t - \frac{\sigma^2 t^2}{2} + \int_{-\infty}^{0} \Bigl( e^{iut} - 1 - \frac{iut}{1 + u^2} \Bigr) dM(u) + \int_{0}^{\infty} \Bigl( e^{iut} - 1 - \frac{iut}{1 + u^2} \Bigr) dN(u) \Bigr\}. \tag{6} \]

The formula of Lévy and Khintchine

\[ \log f_\lambda(t) = \lambda \Bigl\{ i\gamma t + \int_{-\infty}^{\infty} \Bigl( e^{iut} - 1 - \frac{iut}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2}\, dG(u) \Bigr\} \tag{7} \]

is obtained from (6) by introducing the function G(u) with the properties

(1) $\frac{1 + u^2}{u^2}\, dG(u) = dM(u)$ for u < 0;

(2) $\frac{1 + u^2}{u^2}\, dG(u) = dN(u)$ for u > 0.

(3) The jump of G(u) at u = 0 is equal to $\sigma^2$, and the integrand at zero
is defined by continuity:

\[ \Bigl[ \Bigl( e^{iut} - 1 - \frac{iut}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2} \Bigr]_{u=0} = -\frac{t^2}{2}. \]

The function G(u) does not have a simple intuitive meaning, but it is
more convenient to use in proofs than are M(u) and N(u).

§ 17. DEFINITION AND BASIC PROPERTIES


We shall say that the random variable $\xi$ is infinitely divisible if for
every natural number n it can be represented as the sum

\[ \xi = \xi_{n1} + \xi_{n2} + \dots + \xi_{nn} \]

of n independent identically distributed random variables $\xi_{n1}, \xi_{n2}, \dots, \xi_{nn}$.

The distribution functions of infinitely divisible random variables will
be called infinitely divisible distribution functions.

Obviously, the distribution function F(x) is infinitely divisible if † and
only if its characteristic function f(t) is, for every natural number n, the
nth power of some characteristic function $f_n(t)$ (which depends, of course,
on n):

\[ f(t) = [f_n(t)]^n. \tag{1} \]

The formula

\[ f_n(t) = \sqrt[n]{f(t)} \]

does not yet uniquely determine the values of $f_n(t)$ in terms of the values
of f(t) (the nth root has n values). But the additional requirements

(1) $f_n(0) = 1$,
(2) $f_n(t)$ is continuous

make it possible to determine $f_n(t)$ uniquely in every interval of t
containing the point t = 0 in which f(t) does not vanish. We shall deal only
with such a principal branch $\sqrt[n]{f(t)}$ in what follows. It will soon appear that
the characteristic function f(t) of an infinitely divisible law never vanishes
(for real t). Therefore $f_n(t)$ and its corresponding distribution function
$F_n(x)$ are uniquely determined by f(t) (or F(x)).


We shall cite some examples. 





EXAMPLE 1. A normally distributed random variable $\xi$ is infinitely divisible.
In fact, suppose that $M\xi = a$ and $D\xi = \sigma^2$; then we know that the
characteristic function of $\xi$ is

\[ f(t) = e^{iat - \frac{\sigma^2 t^2}{2}}. \]

Since for every n > 0

\[ f_n(t) = e^{i\frac{a}{n}t - \frac{\sigma^2}{2n}t^2} \]

is the characteristic function of a normal law, our assertion is proved.
† Translator's note. See, however, pp. 247-248.

EXAMPLE 2. A random variable $\xi$ distributed according to a Poisson law
is infinitely divisible.

Suppose that the possible values of $\xi$ have the form a + kh (k = 0, 1, 2, ...),
and that

\[ P\{\xi = a + kh\} = \frac{\lambda^k e^{-\lambda}}{k!} \qquad (\lambda > 0). \]

Then the characteristic function of $\xi$ is

\[ f(t) = e^{iat + \lambda (e^{ith} - 1)}. \]

From this we conclude that for every n > 0

\[ f_n(t) = e^{i\frac{a}{n}t + \frac{\lambda}{n}(e^{ith} - 1)} \]

is the characteristic function of a random variable which is also distributed
according to a Poisson law. This proves our assertion.
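
Example 2 can also be seen at the level of probabilities rather than characteristic functions. The following sketch is an illustration under assumptions made here (a = 0, h = 1, $\lambda = 3$, n = 5, and a truncation of the support at 40 points): it convolves n copies of the Poisson law with parameter $\lambda/n$ and compares the result with the Poisson law with parameter $\lambda$.

```python
import math

def poisson_pmf(lam, size):
    """P{N = k} for k = 0..size-1."""
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(size)]

def convolve(p, q):
    """Law of the sum of two independent variables on 0, 1, 2, ... (truncated).
    Entries below the truncation point are not affected by the truncation."""
    out = [0.0] * len(p)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < len(out):
                out[i + j] += pi * qj
    return out

lam, n, size = 3.0, 5, 40
component = poisson_pmf(lam / n, size)
total = component
for _ in range(n - 1):
    total = convolve(total, component)
target = poisson_pmf(lam, size)
max_error = max(abs(x - y) for x, y in zip(total, target))
```

The value of `max_error` should be at the level of floating-point rounding.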


EXAMPLE 3. A random variable distributed according to a Cauchy law *

\[ F(x) = \frac{1}{\pi} \Bigl( \frac{\pi}{2} + \operatorname{arctg} \frac{x - b}{a} \Bigr) \qquad (a > 0) \]

is infinitely divisible.

In fact, it can be calculated that the corresponding characteristic
function is

\[ f(t) = e^{ibt - a|t|}. \]

This proves our assertion.
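EXAMPLE 4. A random variable $\xi$ with the probability density

\[ p(x) = \begin{cases} 0 & \text{for } x \le 0, \\ \dfrac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x} & \text{for } x > 0, \end{cases} \]

where $\alpha > 0$, $\beta > 0$ are constants, is infinitely divisible.

The characteristic function of $\xi$ is

\[ f(t) = \Bigl( 1 - \frac{it}{\beta} \Bigr)^{-\alpha}. \]

For every natural number n

\[ \sqrt[n]{f(t)} = \Bigl( 1 - \frac{it}{\beta} \Bigr)^{-\alpha/n} \tag{2} \]

is again the characteristic function of a distribution of the same form as
the initial one.

This distribution belongs to the system of Pearson curves of the third
and the tenth types. It is known in statistics as the $\chi^2$ distribution, if
$\beta = 1/2$ and $2\alpha$ is an integer.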





* The Cauchy law appeared for the first time in [14].

THEOREM 1. The characteristic function of an infinitely divisible law
never vanishes.

Proof. The assertion of the theorem is an obvious consequence of (1)
and Theorem 4, § 14.

It is easy to verify that there exist any number of characteristic
functions which do not vanish but at the same time are not infinitely divisible.
For example, consider the discrete random variable taking the values
-1, 0, 1 with the probabilities 1/8, 3/4, 1/8. Its characteristic function

\[ f(t) = \tfrac{1}{8} e^{-it} + \tfrac{3}{4} + \tfrac{1}{8} e^{it} = \frac{3 + \cos t}{4} \]

is positive and therefore does not vanish. Not only is the variable not
infinitely divisible, but it cannot be represented as the sum of two
identically distributed independent variables. In fact, suppose that

\[ \xi = \xi_1 + \xi_2, \]

where $\xi_1$ and $\xi_2$ are mutually independent and identically distributed.
It is evident that each of the summands can take only two values $a_1$ and
$a_2$ ($a_1 < a_2$), with probabilities p and q = 1 - p respectively. The possible
values of $\xi_1 + \xi_2$ are the numbers $2a_1$, $a_1 + a_2$, and $2a_2$. The probabilities
of these values are, of course, equal to $p^2$, 2pq, and $q^2$. Since, by hypothesis,
$2a_1 = -1$, $a_1 + a_2 = 0$, $2a_2 = 1$, while $1/8 = p^2$, $3/4 = 2pq$, and $1/8 = q^2$, we reach
a contradiction, for the last three equations are inconsistent.


THEOREM 2. The distribution function of the sum of a finite number of
independent infinitely divisible random variables is itself infinitely divisible.

Proof. Obviously, it is sufficient in the proof to confine ourselves to the
case of two summands $\xi$ and $\eta$. If f(t) and g(t) are the characteristic
functions of these variables, then, by the condition of the theorem, for every
natural number n we have

\[ f(t) = \{ f_n(t) \}^n, \qquad g(t) = \{ g_n(t) \}^n, \]

where $f_n(t)$ and $g_n(t)$ are characteristic functions. The characteristic
function h(t) of the sum $\zeta = \xi + \eta$ satisfies the equation

\[ h(t) = f(t) \cdot g(t) = \{ f_n(t)\, g_n(t) \}^n \]

for every n, which obviously proves the theorem.

We remark that the converse proposition is not true and that it is
possible to give examples of random variables which are not infinitely
divisible but whose sum is infinitely divisible. We postpone the
consideration of such examples until the next section (Example 1).


THEOREM 3. A distribution function which is the limit, in the sense of weak
convergence, of infinitely divisible distribution functions is itself infinitely
divisible.











Proof. Let $F^{(k)}(x)$ be an infinitely divisible distribution function and let

\[ F^{(k)}(x) \Rightarrow F(x) \]

as $k \to \infty$, where F(x) is a distribution function. If $f^{(k)}(t)$ is the
characteristic function of $F^{(k)}(x)$, and f(t) is that of F(x), then

\[ f^{(k)}(t) \Rightarrow f(t). \tag{3} \]

By the condition of the theorem, for every n the function

\[ f_n^{(k)}(t) = \sqrt[n]{f^{(k)}(t)} \]

is a characteristic function and never vanishes for any t. From (3) we
therefore conclude that for every n

\[ f_n^{(k)}(t) \Rightarrow f_n(t), \qquad k \to \infty. \]

From Theorem 2 of § 13 it follows that $f_n(t)$ is a characteristic function.
Since for every natural number n the equation

\[ f(t) = [f_n(t)]^n \]

holds, the theorem is proved.


THEOREM 4. If f(t) is the characteristic function of an infinitely divisible
distribution function, then for every c > 0 the function $\{f(t)\}^c$ is also a
characteristic function.

Proof. In fact, for c = 1/n, when n is a natural number, this follows
from the definition of an infinitely divisible random variable. By the
theorem on the multiplication of characteristic functions this assertion
remains true for any rational number c > 0. Finally, for an irrational
number c > 0 the function $\{f(t)\}^c$ can be approximated uniformly in every
finite interval of t by the function $\{f(t)\}^{c_1}$, where $c_1$ is a rational number.
Hence our assertion follows from a preceding theorem.


THEOREM 5. The totality of infinitely divisible distribution laws coincides 
with the totality of laws which are composed of a finite number of Poisson 
laws and of limits of these laws in the sense of weak convergence. 








Proof. That the composition of a finite number of Poisson laws and
their limit laws is infinitely divisible, follows from Theorems 2 and 3.
We shall prove the converse. Let f(t) be the characteristic function of an
infinitely divisible law. By hypothesis,

\[ f_n(t) = \sqrt[n]{f(t)} \]

is a characteristic function; thus

\[ f_n(t) = \int e^{itx}\, dF_n(x), \tag{4} \]

where $F_n(x)$ is a distribution function. We have *

\[ n\,(f_n(t) - 1) \Rightarrow \log f(t) \quad \text{as } n \to \infty \tag{5} \]

and therefore

\[ e^{n (f_n(t) - 1)} \Rightarrow f(t), \qquad n \to \infty. \tag{6} \]

We represent the integral (4) as the limit as $m \to \infty$ of the Stieltjes sum:

\[ \sum_{k=1}^{m} e^{itc_k} [F_n(c_k) - F_n(c_{k-1})] \Rightarrow f_n(t). \tag{7} \]

Put

\[ a_k = n\, [F_n(c_k) - F_n(c_{k-1})]. \]

Comparing (6) and (7), it is easy to see that

\[ \prod_{k=1}^{m} e^{a_k (e^{itc_k} - 1)} \Rightarrow f(t). \]

Q.E.D.

* This relation can be proved, e.g., in the following way:

\[ n \bigl( \sqrt[n]{a} - 1 \bigr) = n \bigl( e^{\frac{1}{n} \log a} - 1 \bigr) = n \Bigl( 1 + \frac{1}{n} \log a + o\Bigl( \frac{1}{n} \Bigr) - 1 \Bigr) \to \log a. \]
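
The limit relations (5) and (6) used in this proof can be observed numerically. The sketch below is an illustration only; it assumes the law of Example 4 with $\alpha = \beta = 1$, for which the principal branch of the nth root is available in closed form, and evaluates everything at the single arbitrary point t = 1.7.

```python
import cmath

def gamma_cf(t, alpha=1.0, beta=1.0):
    """(1 - it/beta)^(-alpha), principal branch (continuous, equals 1 at t = 0)."""
    return cmath.exp(-alpha * cmath.log(1 - 1j * t / beta))

def nth_root_cf(t, n, alpha=1.0, beta=1.0):
    """Principal branch of the n-th root: (1 - it/beta)^(-alpha/n)."""
    return gamma_cf(t, alpha / n, beta)

t = 1.7
log_f = cmath.log(gamma_cf(t))
errors = []
for n in (10, 100, 1000, 10000):
    approx = n * (nth_root_cf(t, n) - 1)          # left side of (5)
    errors.append(abs(approx - log_f))
poisson_limit = cmath.exp(10000 * (nth_root_cf(t, 10000) - 1))   # left side of (6)
```

The entries of `errors` should shrink roughly like 1/n, and `poisson_limit` should be close to `gamma_cf(t)`.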
We shall make use of the last theorem to construct examples of infinitely 
divisible distributions. 
EXAMPLE 5. The function

\[ f(t) = \frac{1 - b}{1 - b e^{it}} \qquad (0 < b < 1) \]

is the characteristic function of an infinitely divisible distribution.

First of all, from the equation

\[ f(t) = (1 - b) \sum_{n=0}^{\infty} b^n e^{itn} \]

we conclude that f(t) is the characteristic function of a random variable $\xi$
which takes only non-negative integral values with the probabilities

\[ P\{\xi = n\} = (1 - b)\, b^n \qquad (n = 0, 1, 2, \dots). \]

It is easily calculated that

\[ \log f(t) = \sum_{k=1}^{\infty} (e^{itk} - 1)\, \frac{b^k}{k}. \]

Since each separate term of this sum is the logarithm of the characteristic
function of a Poisson law, the assertion is proved.
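
The series for log f(t) in Example 5 is easy to confirm numerically. In the sketch below the values b = 0.6 and t = 2.3 and the truncation at 400 terms are assumptions made for illustration.

```python
import cmath

def geometric_cf(t, b):
    # f(t) = (1 - b) / (1 - b e^{it})
    return (1 - b) / (1 - b * cmath.exp(1j * t))

def poisson_sum_log_cf(t, b, terms=400):
    """Partial sum of sum_{k>=1} (e^{itk} - 1) * b^k / k."""
    return sum((cmath.exp(1j * t * k) - 1) * b**k / k for k in range(1, terms + 1))

b, t = 0.6, 2.3
gap = abs(cmath.exp(poisson_sum_log_cf(t, b)) - geometric_cf(t, b))
```

The quantity `gap` should vanish up to rounding, since the neglected tail of the series is of order $b^{400}$.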


EXAMPLE 6.* Let $\zeta(s) = \zeta(\sigma + i\tau)$ be the Riemann zeta-function defined
for $\sigma > 1$ by means of the series

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} \]

or the Euler product

\[ \zeta(s) = \prod_{p} (1 - p^{-s})^{-1}, \]

extended over all prime numbers.

We shall prove that for every $\sigma > 1$ the function

\[ f(t) = \frac{\zeta(\sigma + it)}{\zeta(\sigma)} \]

is the characteristic function of an infinitely divisible distribution. In fact

\[ \log f(t) = \sum_{p} \bigl[ \log (1 - p^{-\sigma}) - \log (1 - p^{-\sigma - it}) \bigr] = \sum_{p} \sum_{m=1}^{\infty} \frac{p^{-m\sigma}}{m} \bigl( e^{-itm \log p} - 1 \bigr), \]

where the symbol $\sum_p$ denotes that the summation is extended over all the
prime numbers.

Each term of this sum is the logarithm of the characteristic function
of a Poisson law. According to Theorem 5 the characteristic function f(t)
is infinitely divisible.

* A. Ya. Khintchine [59], p. 35.


§ 18. THE CANONICAL REPRESENTATION

THEOREM 1. In order that the function f(t) be the characteristic function
of an infinitely divisible distribution, it is necessary and sufficient that
its logarithm be representable in the form

\[ \log f(t) = i\gamma t + \int \Bigl( e^{itu} - 1 - \frac{itu}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2}\, dG(u), \tag{1} \]

where $\gamma$ is a real constant, G(u) is a nondecreasing function of bounded
variation, and the integrand at u = 0 is defined by the equation

\[ \Bigl[ \Bigl( e^{itu} - 1 - \frac{itu}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2} \Bigr]_{u=0} = -\frac{t^2}{2}. \]

The representation of log f(t) by the formula (1) is unique.
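
To get a feeling for formula (1), it helps to evaluate it when G is a step function with finitely many jumps, so that the integral becomes a finite sum. The sketch below is an illustration only; the function names and the dictionary `jumps` are hypothetical. It checks that the integrand approaches $-t^2/2$ as $u \to 0$, that a single jump $\sigma^2$ at u = 0 yields the normal law, and that a single jump g at $u_0 \ne 0$ yields a Poisson law with span $u_0$ and intensity $g(1 + u_0^2)/u_0^2$, shifted by the constant $-g/u_0$.

```python
import cmath

def lk_integrand(t, u):
    """(e^{itu} - 1 - itu/(1+u^2)) * (1+u^2)/u^2, with the value -t^2/2 at u = 0."""
    if u == 0:
        return complex(-t * t / 2)
    return (cmath.exp(1j * t * u) - 1 - 1j * t * u / (1 + u * u)) * (1 + u * u) / (u * u)

def log_cf_discrete_G(t, gamma, jumps):
    """Right side of (1) when G is a step function with jump g at each point u
    of the hypothetical dictionary `jumps` = {u: g}."""
    return 1j * gamma * t + sum(g * lk_integrand(t, u) for u, g in jumps.items())

t = 0.9
near_zero = lk_integrand(t, 1e-4)

# G with a single jump sigma^2 at u = 0 reproduces the normal law.
sigma2, gamma = 2.0, 0.3
normal_log_cf = log_cf_discrete_G(t, gamma, {0.0: sigma2})

# G with a single jump g at u0 != 0 gives a Poisson law with span u0,
# intensity g*(1+u0^2)/u0^2, shifted by a constant.
u0, g = 1.5, 0.4
rate = g * (1 + u0 * u0) / (u0 * u0)
shift = -g / u0
poisson_log_cf = log_cf_discrete_G(t, 0.0, {u0: g})
```

Here `normal_log_cf` should equal $i\gamma t - \sigma^2 t^2/2$ and `poisson_log_cf` should equal `rate * (e^{i t u0} - 1) + 1j * shift * t`, both up to rounding.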
Proof. Necessity. Suppose that F(x) is an infinitely divisible distribution
and f(t) its characteristic function. Then for every n > 0

\[ f(t) = [f_n(t)]^n, \]

where $f_n(t)$ is a characteristic function. Since $f(t) \ne 0$, according to (5) of
§ 17 we have

\[ n\,[f_n(t) - 1] = n \int (e^{itx} - 1)\, dF_n(x) \Rightarrow \log f(t), \]

where $F_n(x)$ is the distribution function corresponding to the characteristic
function $f_n(t)$. Put

\[ G_n(u) = n \int_{-\infty}^{u} \frac{x^2}{1 + x^2}\, dF_n(x) \]

and

\[ I_n(t) = \int (e^{itu} - 1)\, \frac{1 + u^2}{u^2}\, dG_n(u). \tag{2} \]

Then by Theorem 2 of § 4 the preceding relation may be written as

\[ I_n(t) \Rightarrow \log f(t) \tag{3} \]

and we conclude that *

\[ \operatorname{Re} I_n(t) = \int (\cos ut - 1)\, \frac{1 + u^2}{u^2}\, dG_n(u) \Rightarrow \int (\cos ut - 1)\, \frac{1 + u^2}{u^2}\, dG(u) = \log |f(t)|. \]


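We shall prove that $G_n(+\infty)$ is bounded. For this purpose consider the
expressions

\[ A_n = \int_{|u| \le 1} dG_n(u), \qquad B_n = \int_{|u| > 1} dG_n(u), \qquad C_n = A_n + B_n = \int dG_n(u). \]

Let $0 \le t \le 2$. It is evident that for every $\varepsilon > 0$ and for sufficiently
large n

\[ -\log |f(t)| + \varepsilon \ge \int_{|u| \le 1} (1 - \cos tu)\, \frac{1 + u^2}{u^2}\, dG_n(u) \]

and

\[ -\log |f(t)| + \varepsilon \ge \int_{|u| > 1} (1 - \cos tu)\, \frac{1 + u^2}{u^2}\, dG_n(u). \]

For $|u| \le 1$

\[ (1 - \cos u)\, \frac{1 + u^2}{u^2} \ge \frac{1}{3}, \]

hence the first inequality above gives

\[ -\log |f(1)| + \varepsilon \ge \frac{1}{3} A_n. \tag{4} \]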


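Taking, in the interval $0 \le t \le 2$, the mean of the functions on both
sides of the second inequality above, we obtain

\[ -\frac{1}{2} \int_{0}^{2} \log |f(t)|\, dt + \varepsilon \ge \int_{|u| > 1} \Bigl( 1 - \frac{\sin 2u}{2u} \Bigr) \frac{1 + u^2}{u^2}\, dG_n(u) \ge \frac{1}{2} B_n. \tag{5} \]

Since the quantities $\log |f(1)|$ and $\int_0^2 \log |f(t)|\, dt$ are finite, it follows
from (4) and (5) that $G_n(+\infty)$ is bounded. We shall now prove that

\[ \int_{|u| > T} dG_n(u) \to 0 \quad \text{as } T \to \infty \]

uniformly with respect to n. In fact, for every $\varepsilon > 0$ and for sufficiently
large n

\[ -\log |f(t)| + \varepsilon \ge \int_{|u| \ge T} (1 - \cos tu)\, dG_n(u). \]

Taking the mean in the interval $0 \le t \le 2/T$ (T > 1) of both sides of the
inequality, we obtain

\[ -\frac{T}{2} \int_{0}^{2/T} \log |f(t)|\, dt + \varepsilon \ge \int_{|u| \ge T} \Bigl( 1 - \frac{T}{2u} \sin \frac{2u}{T} \Bigr) dG_n(u). \]

But for $|u| \ge T$

\[ 1 - \frac{T}{2u} \sin \frac{2u}{T} \ge \frac{1}{2}, \]

and for $T > T_\varepsilon$

\[ \Bigl| \frac{T}{2} \int_{0}^{2/T} \log |f(t)|\, dt \Bigr| \le \max_{0 \le t \le 2/T} \bigl| \log |f(t)| \bigr| \le \varepsilon. \]

Therefore for $T > T_\varepsilon$

\[ \int_{|u| > T} dG_n(u) \le 4\varepsilon. \]

* Translator's note. The existence of a G(u) such that $G_n(u) \Rightarrow G(u)$ will appear
in the course of the proof. This fact, however, is not needed at all. Thus the second
integral should be deleted from the formula for $\operatorname{Re} I_n(t)$ above.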


Now on the basis of Theorem 3 bis of § 9 we can choose a subsequence
from $G_n(u)$ such that

\[ G_{n_k}(u) \Rightarrow G(u), \]

where G(u) is a nondecreasing function of bounded variation.*
Put

\[ \gamma_{n_k} = \int \frac{dG_{n_k}(u)}{u} = n_k \int \frac{x}{1 + x^2}\, dF_{n_k}(x); \]

then it is evident from (3) that

\[ I_{n_k}(t) = \int \Bigl( e^{itu} - 1 - \frac{itu}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2}\, dG_{n_k}(u) + it\gamma_{n_k}. \]

The integral on the right side of this equation, as $k \to \infty$, converges to

\[ \int \Bigl( e^{itu} - 1 - \frac{itu}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2}\, dG(u). \]

From (3) we conclude that $\gamma_{n_k}$ must converge to some number $\gamma$ as
$k \to \infty$.

Thus the first part of the theorem is proved.

* Translator's note. The last clause is added for the sake of clarity, cf. the
preceding note.

Sufficiency. Suppose that (1) holds. According to § 6 the integral on the
right side of (1) is the limit of the Stieltjes sums (all $c_k$ are taken to be
different from zero):

\[ \sum_{k=1}^{m} \Bigl( e^{itc_k} - 1 - \frac{itc_k}{1 + c_k^2} \Bigr) \frac{1 + c_k^2}{c_k^2}\, [G(c_k) - G(c_{k-1})] \Rightarrow \int \Bigl( e^{itu} - 1 - \frac{itu}{1 + u^2} \Bigr) \frac{1 + u^2}{u^2}\, dG(u). \]

Each term on the left is the logarithm of the characteristic function of a
Poisson law.

Applying Theorem 5 of § 17, we confirm that f(t) is an infinitely divisible
characteristic function.

It remains to prove the uniqueness of the representation of the logarithm
of an infinitely divisible characteristic function by (1). From (1) it is
easily deduced that

\[ -v(t) = \int_{t-1}^{t+1}\log f(z)\,dz - 2\log f(t) = -2\int e^{itu}\left(1-\frac{\sin u}{u}\right)\frac{1+u^2}{u^2}\,dG(u). \]

Putting

\[ V(u) = 2\int_{-\infty}^{u}\left(1-\frac{\sin v}{v}\right)\frac{1+v^2}{v^2}\,dG(v), \]

we find that

\[ v(t) = \int e^{itu}\,dV(u). \]

The function V(u) is nondecreasing, hence by Theorem 2 of § 12 it is
uniquely determined by its characteristic function v(t).

Since for all v

\[ \left(1-\frac{\sin v}{v}\right)\frac{1+v^2}{v^2} > 0, \]

according to Theorem 3 of § 4 the function G(v) is uniquely determined
by the function V(u). Q.E.D.
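
The identity used in the uniqueness argument can be verified numerically for a single value of u, since it holds for the integrand of (1) pointwise. The Python sketch below is an illustration only: the helper names `h`, `averaged` and `closed_form` are introduced here, and the integral over [t−1, t+1] is approximated by Simpson's rule.

```python
import cmath
import math

def h(t: float, u: float) -> complex:
    """Integrand of (1) for u != 0: (e^{itu} - 1 - itu/(1+u^2)) (1+u^2)/u^2."""
    return (cmath.exp(1j * t * u) - 1 - 1j * t * u / (1 + u * u)) * (1 + u * u) / (u * u)

def averaged(t: float, u: float, steps: int = 2000) -> complex:
    """Simpson approximation of  int_{t-1}^{t+1} h(z, u) dz - 2 h(t, u)."""
    a, b = t - 1.0, t + 1.0
    step = (b - a) / steps            # steps must be even
    total = h(a, u) + h(b, u)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * h(a + i * step, u)
    return total * step / 3 - 2 * h(t, u)

def closed_form(t: float, u: float) -> complex:
    """-2 e^{itu} (1 - sin(u)/u) (1+u^2)/u^2, the kernel appearing in -v(t)."""
    return -2 * cmath.exp(1j * t * u) * (1 - math.sin(u) / u) * (1 + u * u) / (u * u)

def positive_factor(v: float) -> float:
    """(1 - sin(v)/v)(1+v^2)/v^2, which must be positive for G to be recovered from V."""
    return (1 - math.sin(v) / v) * (1 + v * v) / (v * v)
```

The linear term iγt contributes nothing to the left side, because the integral of z over [t−1, t+1] equals 2t; this is why only the integrand needs to be examined.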

We shall call (1) the formula of Lévy and Khintchine as already indicated
in § 16 (see [74] and [56]).


COROLLARY. If the logarithm of a characteristic function is representable
in the form (1), where G(u) is a function of bounded variation (not
necessarily nondecreasing), then such a representation is unique.


Proof. Suppose that f(t) has two representations of the form (1) with
functions G_1(u), G_2(u) and constants γ_1, γ_2 respectively.

Let G_1′ (G_2′) be the positive, G_1″ (G_2″) the negative variation of the function
G_1 (G_2) and let h(t, u) be the integrand in (1).

We obtain then [cf. (2), § 8]:

\[ i\gamma_1 t + \int h(t,u)\,dG_1(u) = i\gamma_1 t + \int h(t,u)\,dG_1'(u) - \int h(t,u)\,dG_1''(u) \]
\[ = i\gamma_2 t + \int h(t,u)\,dG_2(u) = i\gamma_2 t + \int h(t,u)\,dG_2'(u) - \int h(t,u)\,dG_2''(u), \]

whence

\[ i\gamma_1 t + \int h(t,u)\,d\bigl(G_1'(u)+G_2''(u)\bigr) = i\gamma_2 t + \int h(t,u)\,d\bigl(G_2'(u)+G_1''(u)\bigr). \]

But the functions G_1′(u) + G_2″(u) and G_2′(u) + G_1″(u) are nondecreasing;
hence in the equation written above, representing the logarithm of the
characteristic function of an infinitely divisible law, we must have

\[ \gamma_1 = \gamma_2 \quad\text{and}\quad G_1'(u)+G_2''(u) = G_2'(u)+G_1''(u), \]

that is,

\[ G_1'(u)-G_1''(u) = G_2'(u)-G_2''(u). \]

This is equivalent to

\[ G_1 = G_2, \]

Q.E.D.


We shall make use of the formula of Lévy and Khintchine to construct 
examples of infinitely divisible characteristic functions. 
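EXAMPLE 1. Consider the function

\[ f(t) = \frac{1-\beta}{1+\alpha}\cdot\frac{1+\alpha e^{-it}}{1-\beta e^{it}} \qquad (0 < \alpha \le \beta < 1). \]

This function is continuous, f(0) = 1 and

\[ f(t) = \frac{1-\beta}{1+\alpha}\left[\alpha e^{-it} + (1+\alpha\beta)\sum_{n=0}^{\infty}\beta^n e^{int}\right]. \]

Hence it is the characteristic function of a random variable ξ taking all
integral values from −1 to +∞; moreover

\[ P\{\xi=-1\} = \frac{(1-\beta)\alpha}{1+\alpha}, \qquad P\{\xi=n\} = \frac{(1-\beta)(1+\alpha\beta)}{1+\alpha}\,\beta^n \quad [n = 0, 1, 2, \dots]. \]

We observe that f(t) is not an infinitely divisible characteristic function.
For,

\[ \log f(t) = \sum_{n=1}^{\infty}\left[(-1)^{n-1}\frac{\alpha^n}{n}\,(e^{-int}-1) + \frac{\beta^n}{n}\,(e^{int}-1)\right] = i\gamma t + \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG(u), \]

where, as can be easily calculated,

\[ \gamma = \sum_{n=1}^{\infty}\frac{\beta^n + (-1)^n\alpha^n}{1+n^2}, \]

and G(u) is a function of bounded variation, having jumps at the points

\[ u = \pm 1, \pm 2, \pm 3, \dots \]

of magnitudes

\[ \frac{n\beta^n}{1+n^2} \quad\text{for } u = +n \]

and

\[ (-1)^{n-1}\frac{n\alpha^n}{1+n^2} \quad\text{for } u = -n. \]

Thus G(u) is not monotone. Hence according to the Corollary of Theorem
1 of this section, f(t) cannot be the characteristic function of an infinitely
divisible law, Q.E.D.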
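
The computations of Example 1 are easy to reproduce numerically. The Python sketch below is illustrative: the function names and the truncation of the series at 400 terms are assumptions made here. It compares the closed form of f(t) with the series built from the stated probabilities and exhibits a negative jump of G.

```python
import cmath

def f(t: float, alpha: float, beta: float) -> complex:
    """Closed form of the characteristic function in Example 1."""
    return (1 - beta) / (1 + alpha) * (1 + alpha * cmath.exp(-1j * t)) / (1 - beta * cmath.exp(1j * t))

def pmf(n: int, alpha: float, beta: float) -> float:
    """P{xi = n} for n = -1, 0, 1, 2, ... as stated in the example."""
    if n == -1:
        return (1 - beta) * alpha / (1 + alpha)
    if n >= 0:
        return (1 - beta) * (1 + alpha * beta) * beta ** n / (1 + alpha)
    return 0.0

def f_from_pmf(t: float, alpha: float, beta: float, terms: int = 400) -> complex:
    """Characteristic function rebuilt from the probabilities (series truncated)."""
    return sum(pmf(n, alpha, beta) * cmath.exp(1j * n * t) for n in range(-1, terms))

def jump_of_G(n: int, alpha: float, beta: float) -> float:
    """Jump of G at the nonzero integer n in the Levy-Khintchine representation."""
    m = abs(n)
    if n > 0:
        return m * beta ** m / (1 + m * m)
    return (-1) ** (m - 1) * m * alpha ** m / (1 + m * m)
```

For instance, with α = 0.3 and β = 0.5 the jump at u = −2 is negative, which is exactly why G fails to be monotone.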

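The function

\[ \overline{f(t)} = \frac{1-\beta}{1+\alpha}\cdot\frac{1+\alpha e^{it}}{1-\beta e^{-it}} \]

is also a characteristic function, and

\[ \log\overline{f(t)} = \sum_{n=1}^{\infty}\left[\frac{\beta^n}{n}\,(e^{-int}-1) + (-1)^{n-1}\frac{\alpha^n}{n}\,(e^{int}-1)\right]. \]

We shall prove that

\[ g(t) = f(t)\,\overline{f(t)} = |f(t)|^2 \]

is the characteristic function of an infinitely divisible law.
In fact,

\[ \log g(t) = \sum_{n=1}^{\infty}\frac1n\bigl(\beta^n + (-1)^{n-1}\alpha^n\bigr)(e^{-int}-1) + \sum_{n=1}^{\infty}\frac1n\bigl(\beta^n + (-1)^{n-1}\alpha^n\bigr)(e^{int}-1) \]
\[ = \int\left(e^{itx}-1-\frac{itx}{1+x^2}\right)\frac{1+x^2}{x^2}\,dG(x), \]

where G(x) is a nondecreasing function with jumps at the points ±1, ±2,
±3, ...; the jumps at the points +n and −n are equal; their magnitude is

\[ \frac{n}{1+n^2}\bigl(\beta^n + (-1)^{n-1}\alpha^n\bigr) \]

for n > 0.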

It is interesting to note the following: f(t) is a characteristic function
but is not infinitely divisible; its modulus |f(t)| is an infinitely divisible
characteristic function; the infinitely divisible characteristic function
|f(t)|² is decomposed into the product of two characteristic functions f(t) and
\(\overline{f(t)}\), neither of which is infinitely divisible.

We remark further that by our example we have also proved the assertion
that there exist essentially different† characteristic functions whose
moduli are the same.


EXAMPLE 2. It is easy to construct examples of infinitely divisible
characteristic functions f(t) with even more striking properties: the function
f(t) is decomposed into the product of an infinitely divisible characteristic
function and two characteristic functions, neither of which is decomposable
(A. Ya. Khintchine [57]).

† Translator's note. Namely f(t) and |f(t)| in the example. Of course, for every
characteristic function f(t) there are trivially different characteristic functions with
the same modulus, namely, \(e^{iat} f(t)\) and \(e^{iat}\,\overline{f(t)}\) for every real a.

The even function

\[ \varphi(t) = \log\frac{5+4\cos t}{9} \]
of the real argument t has the period 2π and derivatives of all orders,
hence

\[ \varphi(t) = \sum_{n=0}^{\infty} a_n\cos nt, \qquad \sum_{n=0}^{\infty}|a_n| < +\infty. \]

Since φ(0) = 0,

\[ \sum_{n=0}^{\infty} a_n = 0, \]

and consequently

\[ \varphi(t) = \sum_{n=1}^{\infty} a_n(\cos nt - 1). \]

Let p_1, p_2, ..., p_k, ... be the non-negative and −n_1, −n_2, ..., −n_k, ...
the negative numbers among a_1, a_2, ..., a_n, .... Thus

\[ \varphi(t) = \sum_{k=1}^{\infty} p_k(\cos h_k t - 1) - \sum_{k=1}^{\infty} n_k(\cos m_k t - 1), \]

whence

\[ \prod_{k=1}^{\infty} e^{p_k(\cos h_k t - 1)} = \frac{5+4\cos t}{9}\,\prod_{k=1}^{\infty} e^{n_k(\cos m_k t - 1)}. \]

The infinite products on both sides of the equation, being the products
of the moduli of the characteristic functions of Poisson laws, are themselves
infinitely divisible characteristic functions.

In fact if f(t) is an infinitely divisible characteristic function so is |f(t)|²
and hence also |f(t)|. It remains to apply Theorems 2 and 3 of § 17. The
function

\[ \frac{5+4\cos t}{9} = \frac{2+e^{it}}{3}\cdot\frac{2+e^{-it}}{3} \]

is the product of two characteristic functions, each of which is indecomposable.
Thus it is possible to find an example of an infinitely divisible random variable
represented as the sum of three independent random variables, one of which is
infinitely divisible while the other two (nonconstant) are indecomposable into
independent summands.

Define the functions M(u) and N(u) and the constant σ² by setting

\[ M(u) = \int_{-\infty}^{u}\frac{1+z^2}{z^2}\,dG(z) \quad\text{for } u < 0, \]
\[ N(u) = -\int_{u}^{\infty}\frac{1+z^2}{z^2}\,dG(z) \quad\text{for } u > 0, \qquad (6) \]
\[ \sigma^2 = G(+0) - G(-0). \]





The functions M(u) and N(u)

(1) are respectively nondecreasing in the intervals (−∞, 0) and (0, +∞);

(2) are continuous at those and only those points at which G(u) is
continuous;

(3) satisfy the relations

\[ M(-\infty) = N(+\infty) = 0 \]

and

\[ \int_{-\varepsilon}^{0} u^2\,dM(u) + \int_{0}^{\varepsilon} u^2\,dN(u) < +\infty \]

for every finite ε > 0.

Conversely, any two functions M(u) and N(u) satisfying the conditions
(1) and (3) and any constant σ ≥ 0 determine by (6) the characteristic
function of some infinitely divisible law. In terms of M(u) and N(u) we
can write (1) in the following form [cf. (6), § 16]:

\[ \log f(t) = i\gamma t - \frac{\sigma^2}{2}\,t^2 + \int_{-\infty}^{0}\left(e^{iut}-1-\frac{iut}{1+u^2}\right)dM(u) + \int_{0}^{\infty}\left(e^{iut}-1-\frac{iut}{1+u^2}\right)dN(u). \qquad (7) \]

We shall call (7) Lévy's formula.
Finally, we can give (1) the following form:

\[ \log f(t) = i\gamma(\tau)t - \frac{\sigma^2}{2}\,t^2 + \int_{-\infty}^{-\tau}(e^{iut}-1)\,dM(u) + \int_{\tau}^{\infty}(e^{iut}-1)\,dN(u) \]
\[ \qquad + \int_{-\tau}^{0}(e^{iut}-1-iut)\,dM(u) + \int_{0}^{\tau}(e^{iut}-1-iut)\,dN(u), \qquad (8) \]

where M(u) and N(u) have the same meanings as in (6), and τ is an arbitrary
constant, chosen so that τ and −τ are continuity points of the functions
N(u) and M(u) respectively. The relation between γ(τ) and the γ
in the formula (1) is easily found:

\[ \gamma(\tau) = \gamma + \int_{|u|<\tau} u\,dG(u) - \int_{|u|>\tau}\frac{1}{u}\,dG(u). \qquad (9) \]
Lévy's formula and that of Lévy and Khintchine are generalizations of
Kolmogorov's formula, which was found by him as early as 1932 [64]
for infinitely divisible laws F(x) of finite variance. It turns out that in that
case F(x) is infinitely divisible if and only if

\[ \log f(t) = i\gamma t + \int\{e^{itu}-1-itu\}\,\frac{1}{u^2}\,dK(u), \qquad (10) \]

where γ is a constant and K(u) [K(−∞) = 0] is a nondecreasing function
of bounded variation. The representation of log f(t) by this formula is
unique.

We shall call (10) Kolmogorov's formula. An easy calculation shows that
in this case

\[ \left[\frac{d}{dt}\log f(t)\right]_{t=0} = iM\xi = i\gamma, \]

\[ \left[\frac{d^2}{dt^2}\log f(t)\right]_{t=0} = -D^2\xi = -\int dK(u), \]

whence

\[ \gamma = M\xi \quad\text{and}\quad K(+\infty) = D^2\xi. \qquad (11) \]

These formulas clarify the probability meanings of the constant γ and
the variation of K(u) in (10).
Kolmogorov's formula can be obtained either from that of Lévy and
Khintchine considering the function

\[ K(u) = \int_{-\infty}^{u}(1+v^2)\,dG(v), \]

or in the same way as the formula of Lévy and Khintchine itself was
obtained.
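For later purposes we shall need to know the representation of a normal
law and a Poisson law by means of (1), (7), and (10). It is easily calculated
that for the normal law

\[ P\{\xi < x\} = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{(z-a)^2}{2\sigma^2}}\,dz \qquad (12) \]

we should put in (1) and (10)

\[ \gamma = a; \qquad K(u) = G(u) = \begin{cases} 0 & \text{for } u \le 0,\\ \sigma^2 & \text{for } u > 0, \end{cases} \qquad (13) \]

and in (7)

\[ \gamma = a, \quad M(u) \equiv 0, \quad N(u) \equiv 0, \quad \sigma = \sigma. \qquad (13') \]

For a random variable distributed according to the Poisson law with
the characteristic function

\[ f(t) = e^{\lambda(e^{it}-1)}, \]

we should put in (1)

\[ \gamma = \frac{\lambda}{2}, \qquad G(u) = \begin{cases} 0 & \text{for } u \le 1,\\ \lambda/2 & \text{for } u > 1; \end{cases} \qquad (14) \]

in (7)

\[ \gamma = \frac{\lambda}{2}, \quad \sigma = 0, \quad M(u) \equiv 0, \quad N(u) = \begin{cases} -\lambda & \text{for } 0 < u \le 1,\\ 0 & \text{for } u > 1; \end{cases} \qquad (14') \]

and in (10)

\[ \gamma = \lambda, \qquad K(u) = \begin{cases} 0 & \text{for } u \le 1,\\ \lambda & \text{for } u > 1. \end{cases} \qquad (14'') \]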
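
The three representations of the Poisson law can be compared directly, because in each of them the integral reduces to a single jump at u = 1. The following Python sketch is an illustration under that reading; the value of `lam` and the function names are choices made here.

```python
import cmath

lam = 1.7  # Poisson parameter; an arbitrary illustrative value

def log_f(t: float) -> complex:
    """Logarithm of the Poisson characteristic function, lam (e^{it} - 1)."""
    return lam * (cmath.exp(1j * t) - 1)

def levy_khintchine(t: float) -> complex:
    """Formula (1): gamma = lam/2 and G has one jump lam/2 at u = 1."""
    gamma, u, jump = lam / 2, 1.0, lam / 2
    kernel = (cmath.exp(1j * t * u) - 1 - 1j * t * u / (1 + u * u)) * (1 + u * u) / (u * u)
    return 1j * gamma * t + kernel * jump

def levy(t: float) -> complex:
    """Formula (7): gamma = lam/2, sigma = 0, M = 0, N jumps by lam at u = 1."""
    gamma, u, jump = lam / 2, 1.0, lam
    return 1j * gamma * t + (cmath.exp(1j * u * t) - 1 - 1j * u * t / (1 + u * u)) * jump

def kolmogorov(t: float) -> complex:
    """Formula (10): gamma = lam and K has one jump lam at u = 1."""
    gamma, u, jump = lam, 1.0, lam
    return 1j * gamma * t + (cmath.exp(1j * t * u) - 1 - 1j * t * u) / (u * u) * jump
```

All three functions agree with `log_f` up to rounding, which also illustrates how the centering term changes the constant γ from λ/2 in (1) and (7) to λ in (10).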


We remark that the form of the function G(u) in the formula of Lévy 
and Khintchine for a normal law and a Poisson law yields the following 
assertion: 


If the composition of two infinitely divisible laws is a normal (Poisson) 


law, then each of the components must also be a normal (Poisson) law. 
In fact, if G_1(u) and G_2(u) are two nondecreasing functions and

\[ G_1(u) + G_2(u) = G(u), \]

where G(u) is defined by (13) [or (14) in the case of a Poisson law], then
both G_1(u) and G_2(u) can have only one point of increase at u = 0 (or at
u = 1).
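We shall now calculate the function K(u) in Kolmogorov's formula for
the distribution considered in Example 4, § 17. To this end we shall go
through, for this example, all the computations made in the proof of (1).
Thus we put

\[ K_n(x) = n\int_{-\infty}^{x} z^2 p_n(z)\,dz, \]

where p_n(x) is the density of the distribution whose characteristic function
is determined by (1) of § 17, i.e.,

\[ p_n(x) = \begin{cases} 0 & \text{for } x \le 0,\\ c_n x^{\frac{\alpha}{n}-1} e^{-\beta x} & \text{for } x > 0, \end{cases} \qquad\text{where } c_n = \frac{\beta^{\alpha/n}}{\Gamma(\alpha/n)}. \]

Thus,

\[ K_n(x) = \begin{cases} 0 & \text{for } x \le 0,\\ n c_n\displaystyle\int_0^x z^{\frac{\alpha}{n}+1} e^{-\beta z}\,dz & \text{for } x > 0. \end{cases} \]

Now we remark that

\[ n c_n = \frac{n\beta^{\alpha/n}}{\Gamma(\alpha/n)} \to \alpha \quad\text{as } n \to \infty \]

and for every x > 0

\[ K_n(x) \Rightarrow \alpha\int_0^x z e^{-\beta z}\,dz = K(x) \qquad (n \to \infty). \]

Moreover, as n → ∞

\[ K_n(+\infty) = n c_n\int_0^{\infty} z^{\frac{\alpha}{n}+1} e^{-\beta z}\,dz = n c_n\,\frac{\Gamma\left(\frac{\alpha}{n}+2\right)}{\beta^{\frac{\alpha}{n}+2}} \to \frac{\alpha}{\beta^2} = K(+\infty). \]

We find also that

\[ \gamma = M\xi = c\int_0^{\infty} x^{\alpha} e^{-\beta x}\,dx = \frac{c\,\Gamma(\alpha+1)}{\beta^{\alpha+1}} = \frac{\alpha}{\beta}. \]

Thus, according to (10) we have the equation

\[ \log f(t) = -\alpha\log\left(1-\frac{it}{\beta}\right) = \frac{i\alpha t}{\beta} + \alpha\int_0^{\infty}\{e^{itx}-1-itx\}\,\frac{e^{-\beta x}}{x}\,dx. \]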
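
The final equation and the limit n c_n → α can both be checked numerically. The Python sketch below is only an illustration: the helper names, the truncation of the integral at 60/β and the use of Simpson's rule are assumptions made here, not part of the text.

```python
import cmath
import math

def log_f(t: float, alpha: float, beta: float) -> complex:
    """-alpha log(1 - it/beta), the logarithm of the characteristic function."""
    return -alpha * cmath.log(1 - 1j * t / beta)

def kolmogorov_rhs(t: float, alpha: float, beta: float, steps: int = 100_000) -> complex:
    """i (alpha/beta) t + alpha * int_0^inf (e^{itx} - 1 - itx) e^{-beta x}/x dx."""
    upper = 60.0 / beta                      # e^{-60} makes the neglected tail negligible

    def g(x: float) -> complex:
        if x == 0.0:
            return 0.0                       # the integrand tends to 0 as x -> 0
        return (cmath.exp(1j * t * x) - 1 - 1j * t * x) * math.exp(-beta * x) / x

    step = upper / steps                     # steps must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * step)
    return 1j * alpha * t / beta + alpha * total * step / 3

def n_c_n(n: int, alpha: float, beta: float) -> float:
    """n * beta^{alpha/n} / Gamma(alpha/n), which should approach alpha."""
    return n * beta ** (alpha / n) / math.gamma(alpha / n)
```

The agreement of `kolmogorov_rhs` with `log_f` reflects the fact that dK(x) = α x e^{−βx} dx, so that dK(x)/x² = α e^{−βx}/x dx.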


§ 19. CONDITIONS FOR CONVERGENCE
OF INFINITELY DIVISIBLE DISTRIBUTIONS


THEOREM 1.* For the convergence of a sequence {F_n(x)} of infinitely
divisible laws to a limit law F(x) it is necessary and sufficient that as
n → ∞

(1) G_n(x) ⇒ G(x),
(2) γ_n → γ,

where the functions G_n(x) and G(x) and the constants γ_n and γ are defined
by the formula of Lévy and Khintchine for F_n(x) and F(x) respectively.


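Proof. Necessity. Suppose that F_n(x) ⇒ F(x). Then f_n(t) ⇒ f(t) according
to Theorem 1 of § 13. Since f_n(t) and f(t) do not vanish for any t, as n → ∞

\[ \log f_n(t) = i\gamma_n t + \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG_n(u) \]
\[ \Rightarrow \log f(t) = i\gamma t + \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG(u). \]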











* B. V. Gnedenko [37].
From this we conclude that

\[ \operatorname{Re}\log f_n(t) = \int(\cos ut-1)\,\frac{1+u^2}{u^2}\,dG_n(u) \Rightarrow \int(\cos ut-1)\,\frac{1+u^2}{u^2}\,dG(u). \]

We now deduce, in literally the same way as in the proof of Theorem 1
of § 18, that the set {G_n(x)} is conditionally compact.
Let us take any convergent subsequence

\[ G_{n_k}(x) \Rightarrow G^*(x). \]

Then

\[ \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG_{n_k}(u) \Rightarrow \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG^*(u). \]

On the other hand,

\[ i\gamma_{n_k}t + \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG_{n_k}(u) \Rightarrow i\gamma t + \int\left(e^{itu}-1-\frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG(u). \]

Hence the sequence γ_{n_k} has a limit γ*, and by virtue of the uniqueness of
the representation of an infinitely divisible law by the formula of Lévy and
Khintchine we must have γ* = γ, G*(u) = G(u).

Sufficiency. From the conditions of the theorem it follows at once that

\[ \log f_n(t) \to \log f(t) \]

for every t, that is,

\[ f_n(t) \to f(t), \]

Q.E.D.
In the sequel we need Theorem 1 in another form [37].


THEOREM 2. For the convergence of infinitely divisible distribution functions
F_n(x) to a limit distribution function F(x) it is necessary and sufficient
that as n → ∞

(1) M_n(u) → M(u), N_n(u) → N(u),
at the continuity points of the functions M(u) and N(u),

(2) γ_n(τ) → γ(τ),

(3) \[ \lim_{\varepsilon\to0}\,\varlimsup_{n\to\infty}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\} = \lim_{\varepsilon\to0}\,\varliminf_{n\to\infty}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\} = \sigma^2, \]

where the functions M_n(u), N_n(u) and M(u), N(u) and the constants σ_n,
γ_n(τ) and σ, γ(τ) are defined by (6) and (9) of § 18 for the distribution
functions F_n(x) and F(x) respectively.


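Proof. Let F_n ⇒ F.

The necessity of the conditions (1) and (2) follows from the necessity
of the condition G_n ⇒ G of the preceding theorem, the formulas defining
M(u), N(u), and γ(τ), and the theorem at the beginning of § 9.

Further, let −ε and +ε be continuity points of the functions M_n(u),
M(u) and N_n(u), N(u). Then, putting

\[ I_n(\varepsilon) = G_n(\varepsilon) - G_n(-\varepsilon) = \int_{-\varepsilon}^{0}\frac{u^2}{1+u^2}\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}\frac{u^2}{1+u^2}\,dN_n(u), \]

\[ I(\varepsilon) = G(\varepsilon) - G(-\varepsilon) = \int_{-\varepsilon}^{0}\frac{u^2}{1+u^2}\,dM(u) + \sigma^2 + \int_{0}^{\varepsilon}\frac{u^2}{1+u^2}\,dN(u), \]

we have

\[ I_n(\varepsilon) \to I(\varepsilon) \qquad (n \to \infty). \qquad (1) \]

On the basis of the relations

\[ \frac{1}{1+\varepsilon^2}\int_{-\varepsilon}^{0}u^2\,dM_n(u) \le \int_{-\varepsilon}^{0}\frac{u^2}{1+u^2}\,dM_n(u) \le \int_{-\varepsilon}^{0}u^2\,dM_n(u), \]

\[ \frac{1}{1+\varepsilon^2}\int_{0}^{\varepsilon}u^2\,dN_n(u) \le \int_{0}^{\varepsilon}\frac{u^2}{1+u^2}\,dN_n(u) \le \int_{0}^{\varepsilon}u^2\,dN_n(u), \]

we conclude that

\[ \frac{1}{1+\varepsilon^2}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\} \le I_n(\varepsilon) \le \left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\}. \qquad (2) \]

Finally, from (1) and (2) we deduce that*

\[ \frac{1}{1+\varepsilon^2}\,\varlimsup_{n\to\infty}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\} \le I(\varepsilon) \le \varlimsup_{n\to\infty}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\}. \]

* Evidently the same inequalities hold for lower limits.

As ε tends to zero, both sides of the inequality above have the same
limit, namely,

\[ \lim_{\varepsilon\to0} I(\varepsilon) = \sigma^2. \]

Thus the necessity of the condition (3) is proved.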
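Let us prove the sufficiency of the conditions. To this end we shall
prove that they imply the conditions of Theorem 1.
By the theorem at the beginning of § 9, we have

\[ G_n(u) = \int_{-\infty}^{u}\frac{z^2}{1+z^2}\,dM_n(z) \to \int_{-\infty}^{u}\frac{z^2}{1+z^2}\,dM(z) = G(u), \qquad u < 0 \quad (n \to \infty) \qquad (3) \]

at the continuity points of M(u) [and consequently also at the continuity
points of G(u)]. From this we obtain

\[ \lim_{\varepsilon\to0}\,\lim_{n\to\infty} G_n(-\varepsilon) = G(-0). \qquad (3') \]

By (2), we have

\[ G_n(-\varepsilon) + \frac{1}{1+\varepsilon^2}\left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\} \le G_n(+\varepsilon) \le G_n(-\varepsilon) + \left\{\int_{-\varepsilon}^{0}u^2\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}u^2\,dN_n(u)\right\}. \]

Hence by (3′) and the third condition of the theorem,

\[ \lim_{\varepsilon\to0}\,\varlimsup_{n\to\infty} G_n(+\varepsilon) = \lim_{\varepsilon\to0}\,\varliminf_{n\to\infty} G_n(+\varepsilon) = G(-0) + \sigma^2 = G(+0). \]

For every u_1 > 0 and u_2 > 0 which are continuity points of the function
G(u)

\[ \int_{u_1}^{u_2}\frac{z^2}{1+z^2}\,dN_n(z) \to \int_{u_1}^{u_2}\frac{z^2}{1+z^2}\,dN(z), \qquad (4) \]

so that [ε is a continuity point of N(u)]

\[ \lim_{n\to\infty} G_n(u) = \lim_{\varepsilon\to0}\,\lim_{n\to\infty}\left(G_n(\varepsilon) + \int_{\varepsilon}^{u}\frac{z^2}{1+z^2}\,dN_n(z)\right) = \lim_{\varepsilon\to0}\left(G(+0) + \int_{\varepsilon}^{u}\frac{z^2}{1+z^2}\,dN(z)\right) = G(u) \qquad (u > 0). \qquad (5) \]

Thus we have proved that

\[ G_n(u) \to G(u) \qquad (n \to \infty) \]

at all continuity points of G(u).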
Now we have

\[ G_n(+\infty) = \int_{-\infty}^{-\varepsilon}\frac{u^2}{1+u^2}\,dM_n(u) + \int_{\varepsilon}^{\infty}\frac{u^2}{1+u^2}\,dN_n(u) + \int_{-\varepsilon}^{0}\frac{u^2}{1+u^2}\,dM_n(u) + \sigma_n^2 + \int_{0}^{\varepsilon}\frac{u^2}{1+u^2}\,dN_n(u). \qquad (6) \]

From (3), (4), (5), and (6) we conclude that as n → ∞

\[ G_n(+\infty) \to G(+\infty). \]

Finally, the condition (2) of Theorem 1 follows from the conditions of
Theorem 2 by the theorem at the beginning of § 9, the preceding results,
and the formula defining γ(τ).


THEOREM 3. For the convergence of the sequence F_n(x) of infinitely divisible
laws with finite variances to a limit law and the convergence of their variances
to the variance of the limit law F(x), it is necessary and sufficient that
as n → ∞

(1) K_n(u) ⇒ K(u),
(2) γ_n → γ,

where the functions K_n(u) and K(u) and the constants γ_n and γ are defined
by Kolmogorov's formula (10) of § 18 for F_n(x) and F(x).


Proof. The sufficiency of the relations (1) and (2) for the convergence is
evident from Theorem 2 of § 13 and (10) of § 18.

We shall now prove the necessity of the conditions of the theorem.

Let F_n(x) ⇒ F(x). By Theorem 1 of § 13 this is equivalent to the relation

\[ f_n(t) \Rightarrow f(t) \]

and consequently to the relation

\[ \log f_n(t) = i\gamma_n t + \int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK_n(u) \Rightarrow \log f(t) = i\gamma t + \int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK(u). \qquad (7) \]

We are now assuming that the variance of F_n(x) converges to that of
F(x). From this and from (11) of § 18 it follows that

\[ K_n(+\infty) \to K(+\infty). \qquad (8) \]

For every t different from zero we obviously have

\[ i\gamma_n + \frac{1}{t}\int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK_n(u) \to i\gamma + \frac{1}{t}\int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK(u) \qquad (n \to \infty). \]

Now let t → 0. The integrals then converge to zero uniformly with
respect to n.* Hence we obtain

\[ \gamma_n \to \gamma. \]


Just as in the proof of Theorem 2 of § 9, let us choose from the sequence
{K_n(u)} a subsequence K_{n_k}(u) which converges to some nondecreasing
function K̄(u) at every continuity point of the latter.

We now show that

\[ \int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK_{n_k}(u) \to \int\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,d\bar K(u). \qquad (9) \]

If the function K̄(u) is continuous at the points −B and +B, then by
the theorem at the beginning of § 9, for a fixed t and sufficiently large k
we have

\[ \left|\int_{-B}^{+B}\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK_{n_k}(u) - \int_{-B}^{+B}\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,d\bar K(u)\right| < \varepsilon. \qquad (10) \]

On the other hand,

\[ L_k = \left|\int_{|u|>B}\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,dK_{n_k}(u)\right| \le 2|t|\int_{|u|>B}\frac{1}{|u|}\,dK_{n_k}(u) \le \frac{2|t|}{B}\,\sup_k K_{n_k}(+\infty) \]

and

\[ L = \left|\int_{|u|>B}\{e^{iut}-1-iut\}\,\frac{1}{u^2}\,d\bar K(u)\right| \le \frac{2|t|}{B}\,\bar K(+\infty). \]

But K_{n_k}(+∞) is bounded. Therefore whatever ε > 0 and t may be, we can
make

\[ L_k < \varepsilon, \qquad L < \varepsilon \qquad (11) \]

by taking B sufficiently large. (9) follows immediately from (10) and (11).
By virtue of the uniqueness of the representation by Kolmogorov's
formula we conclude that

\[ \bar K(u) = K(u). \]
* In fact it is easy to prove that |e^{itu} − 1 − itu| ≤ t²u²/2. Therefore

\[ \left|\int\{e^{itu}-1-itu\}\,\frac{1}{u^2}\,dK_n(u)\right| \le \frac{t^2}{2}\int dK_n(u), \]

and the multiplier of t²/2 on the right side is bounded.
Taking into account (8), we conclude also that

\[ K_{n_k}(u) \Rightarrow K(u), \]

i.e., every weakly convergent subsequence K_{n_k}(u) converges to K(u).
It follows that the whole sequence K_n(u) converges to K(u); this proves
the necessity of the first condition of the theorem.
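
Theorem 3 can be illustrated by the normalized Poisson variables (X − λ)/√λ, for which γ_n = 0 and K_n has a single jump of size 1 at u = 1/√λ; as λ → ∞ this jump moves to u = 0, which by (13) of § 18 is the function K(u) of the normal law with a = 0, σ = 1. The Python sketch below is an illustration of this hypothetical example; the function names are choices made here.

```python
import cmath
import math

def log_f_n(t: float, lam: float) -> complex:
    """Kolmogorov form for (X - lam)/sqrt(lam), X ~ Poisson(lam):
    gamma_n = 0 and K_n has one jump of size 1 at u = 1/sqrt(lam)."""
    u = 1.0 / math.sqrt(lam)
    return (cmath.exp(1j * t * u) - 1 - 1j * t * u) / (u * u)

def K_n(u: float, lam: float) -> float:
    """K_n(u) for the normalized Poisson law (left-continuous convention)."""
    return 1.0 if u > 1.0 / math.sqrt(lam) else 0.0

def K(u: float) -> float:
    """K(u) of the standard normal law: a single unit jump at u = 0."""
    return 1.0 if u > 0 else 0.0

# For large lam the logarithm of the characteristic function approaches -t^2/2.
print(abs(log_f_n(1.0, 1e6) - (-0.5)) < 1e-3)
```

Here K_n(u) → K(u) at every u ≠ 0 and K_n(+∞) = K(+∞) = 1, so both conditions of the theorem hold with γ_n = γ = 0.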


Part II GENERAL LIMIT THEOREMS 


CHAPTER 4 


GENERAL LIMIT THEOREMS FOR SUMS 
OF INDEPENDENT SUMMANDS 


§ 20. STATEMENT OF THE PROBLEM.
SUMS OF INFINITELY DIVISIBLE SUMMANDS


The most general statement of the problem of the nature of limit
distribution functions for sums of independent random variables can be
formulated as follows: Let ζ_1, ζ_2, ..., ζ_n, ... be a sequence of random
variables each of which is the sum of a certain number of mutually
independent random variables,

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n}. \]

Suppose that for a suitable choice of real constants A_n the distribution
functions of the variables ζ_n − A_n converge to a certain limit. It is asked
what properties this limit distribution function must possess.

The problem stated in such a general way does not, however, present
any real interest if the ξ_{nk} are not subject to some additional conditions,
since any distribution function F(x) can appear as the limit of the distribution
functions of the sums

\[ \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} - A_n. \qquad (1) \]

To this end it suffices, for example, for the first term in each sum (1) to
have the distribution F(x), the others to be zero with probability 1, while
A_n is chosen to be zero.

Problems in mathematical statistics and theoretical physics which
reduce in mathematical terms to the study of the limiting behavior of 
sums of random variables call for the same reasonable general restrictions 
that it is necessary to introduce in the present statement of the problem. 
Namely, it is to be remembered that the specific properties of the limit 
distribution functions should be determined by the fact that they are the 
limits for sums of an increasing number of independent random variables, 
such that the role of a single summand becomes vanishingly small as n — œ. 

With this additional restriction (up to now formulated only in a purely 
qualitative manner) the problem just posed received an exhaustive solu- 
tion in the papers of Bawly [1] and A. Ya. Khintchine [58]. It was proved 
by these two mathematicians that every distribution function which is 
the limit of the distribution functions of the sums (1) is infinitely divisible. 
This central result of the present chapter will be obtained in § 21 for
the case of summands with finite variances and in § 24 in the general
case. Its proof is based on another basic proposition in the theory, namely,
that under the restriction introduced above the distribution functions of
the sums (1) will approach the distribution functions of the sums

\[ \tilde\xi_{n1} + \tilde\xi_{n2} + \dots + \tilde\xi_{nk_n} - A_n \]

of specially constructed infinitely divisible random variables. The proof
of this theorem forms the content of § 24.

In § 25 various forms of necessary and sufficient conditions for the 
existence of a limit distribution for the sequence of sums (1) are presented. 
We remark at once that these theorems can be used also to determine 
conditions for convergence to any given limit law. 

A more precise idea of the vanishingly small role of individual summands
in the formation of ζ_n is expressed by the following definition: the variables
ξ_{nk} are called infinitesimal if

\[ \sup_{1\le k\le k_n} P\{|\xi_{nk}| \ge \varepsilon\} \to 0 \qquad (2) \]

as n → ∞ for every ε > 0.
We see that for further derivations it is sufficient to require somewhat
less. The restriction which we shall use is given by the following definition:


DEFINITION. The variables ξ_{nk} are called asymptotically constant if it is
possible to find constants a_{nk} so that for every ε > 0

\[ \sup_{1\le k\le k_n} P\{|\xi_{nk} - a_{nk}| > \varepsilon\} \to 0 \qquad (3) \]

as n → ∞.


LEMMA 1. If the ξ_{nk} are asymptotically constant, then it is possible to take
a_{nk} = m_{nk} in (3), where m_{nk} is a median* of the variable ξ_{nk}, that is, a number
such that

\[ P\{\xi_{nk} \ge m_{nk}\} \ge \frac12, \qquad P\{\xi_{nk} \le m_{nk}\} \ge \frac12. \]

Proof. We remark first of all that if the probability of ξ lying in some
interval is greater than 1/2, then every median mξ belongs to this interval.
Now let ξ_{nk} be asymptotically constant. It is sufficient to prove that

\[ \sup_{1\le k\le k_n} |m_{nk} - a_{nk}| \to 0 \]

as n → ∞.
For every ε > 0 we can find n(ε) such that for n > n(ε)

\[ \sup_{1\le k\le k_n} P\{|\xi_{nk} - a_{nk}| > \varepsilon\} < \frac12, \]

that is,

\[ \inf_{1\le k\le k_n} P\{|\xi_{nk} - a_{nk}| \le \varepsilon\} > \frac12. \]

Hence by the remark made at the beginning we must have

\[ \sup_{1\le k\le k_n} |m_{nk} - a_{nk}| \le \varepsilon \]

for n > n(ε), which is what is required.

* There can be more than one number which satisfies the inequality; by a
median we mean any one of them.


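LEMMA 2. The variables ξ_{nk} are infinitesimal if and only if

\[ \sup_{1\le k\le k_n}\int\frac{x^2}{1+x^2}\,dF_{nk}(x) \to 0 \qquad (4) \]

as n → ∞.


Proof. (4) follows from (2). In fact, if 0 < ε < 1 and if n is sufficiently
large,

\[ \sup_{1\le k\le k_n}\int\frac{x^2}{1+x^2}\,dF_{nk}(x) \le \sup_{1\le k\le k_n}\int_{|x|<\varepsilon}x^2\,dF_{nk}(x) + \sup_{1\le k\le k_n}\int_{|x|\ge\varepsilon}dF_{nk}(x) \le \varepsilon^2 + \sup_{1\le k\le k_n}P\{|\xi_{nk}| \ge \varepsilon\}. \]

Conversely, by virtue of the inequalities

\[ \sup_{1\le k\le k_n}\int_{|x|\ge\varepsilon}dF_{nk}(x) \le \frac{1+\varepsilon^2}{\varepsilon^2}\,\sup_{1\le k\le k_n}\int\frac{x^2}{1+x^2}\,dF_{nk}(x), \]

(2) follows from (4).
We can now write down the condition for the variables ξ_{nk} to be
asymptotically constant as follows:

\[ \sup_{1\le k\le k_n}\int\frac{x^2}{1+x^2}\,dF_{nk}(x+m_{nk}) \to 0 \]

as n → ∞.
For later purposes we shall need the following: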


THEOREM 1. In order that the random variables ξ_{nk} should be infinitesimal
it is necessary† that as n → ∞

\[ \sup_{1\le k\le k_n} |f_{nk}(t) - 1| \Rightarrow 0. \]

† Translator's note. The condition is also sufficient, as will be needed later. The
proof follows from (2) of § 14, by taking T arbitrarily large.

Proof. In fact, for every ε > 0

\[ \sup_{1\le k\le k_n}|f_{nk}(t)-1| = \sup_{1\le k\le k_n}\left|\int(e^{itx}-1)\,dF_{nk}(x)\right| \le \sup_{1\le k\le k_n}\int_{|x|<\varepsilon}|e^{itx}-1|\,dF_{nk}(x) + 2\sup_{1\le k\le k_n}\int_{|x|\ge\varepsilon}dF_{nk}(x). \]

Or, taking account of the inequality |e^{iz} − 1| ≤ |z|, valid for real z,

\[ \sup_{1\le k\le k_n}|f_{nk}(t)-1| \le \varepsilon|t| + 2\sup_{1\le k\le k_n}P\{|\xi_{nk}| \ge \varepsilon\}. \]

Hence if the ξ_{nk} are infinitesimal, we immediately obtain the required
conclusion.
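
The last inequality can be tried out on a simple array. The Python sketch below assumes a hypothetical example not taken from the text: ξ_{nk} = ±1/√n with probability 1/2 each, so that f_{nk}(t) = cos(t/√n) for every k; the function name `check_bound` is also a choice made here.

```python
import math

def check_bound(n: int, t: float, eps: float) -> tuple[float, float]:
    """Return (|f_nk(t) - 1|, eps*|t| + 2 P{|xi_nk| >= eps}) for xi_nk = +-1/sqrt(n)."""
    a = 1.0 / math.sqrt(n)                 # the only possible value of |xi_nk|
    lhs = abs(math.cos(t * a) - 1.0)       # f_nk(t) = cos(t a), the same for every k
    tail = 1.0 if a >= eps else 0.0        # P{|xi_nk| >= eps}
    rhs = eps * abs(t) + 2.0 * tail
    return lhs, rhs

# For fixed t the left side tends to 0 as n grows, as the theorem asserts.
print([round(check_bound(n, 3.0, 0.05)[0], 6) for n in (10, 1000, 100000)])
```

In this example the supremum over k is attained by every summand, so the bound is checked for a single characteristic function.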


§ 21. LIMIT DISTRIBUTIONS WITH FINITE VARIANCES


In this section we shall consider a double sequence

\[ \xi_{n1}, \xi_{n2}, \dots, \xi_{nk_n} \]

of random variables which are independent in each row, subject to the
conditions:

(α)
\[ \sup_{1\le k\le k_n} P\{|\xi_{nk} - M\xi_{nk}| \ge \varepsilon\} \to 0 \]
as n → ∞ for every ε > 0.

(β) The ξ_{nk} have finite variances and

\[ D^2\Bigl(\sum_{k=1}^{k_n}\xi_{nk}\Bigr) = \sum_{k=1}^{k_n}\sigma_{nk}^2 \le C, \]

where C is a constant independent of n.
It is natural that in the study of limit laws for the sums

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} - A_n, \qquad (1) \]

where the A_n are suitably chosen constants, and where the variables ξ_{nk}
have finite variances, the case in which not only the distribution laws of
the sums (1) converge to a limit but also the variances of the sums converge
to the variance of the limit law, presents the greatest interest. Hence the
presence and the meaning of the condition (β) is clear.

Aside from their independent interest, the results of this section are 
valuable in that they give a clear idea of the proof of general theorems in 
the next section. 
THEOREM 1.* In order that for suitably chosen† constants A_n the distribution
functions of the sums

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} - A_n \qquad (2) \]

of independent random variables subject to the conditions (α) and (β)
converge to a limit, it is necessary and sufficient that the distribution
functions of certain "accompanying laws" converge. These accompanying laws
are infinitely divisible and the logarithms of their characteristic functions
are defined by

\[ \psi_n(t) = -itA_n + \sum_{k=1}^{k_n}\left\{itM\xi_{nk} + \int(e^{itx}-1)\,dF_{nk}(x+M\xi_{nk})\right\}, \qquad (3) \]

where F_{nk}(x) is the distribution function of ξ_{nk}.

The limit laws for the two sequences of distribution functions coincide.

Proof. The characteristic function of ζ_n is

\[ f_n(t) = e^{-itA_n}\prod_{k=1}^{k_n} f_{nk}(t). \]

By Theorems 1 and 2 of § 13, for the convergence of the distribution
functions of the sums (2) to a limit it is necessary and sufficient that
as n → ∞

\[ f_n(t) \Rightarrow f(t), \qquad (4) \]

where f(t) is the characteristic function of the limit distribution function.
If we denote the distribution function of the variables ξ′_{nk} = ξ_{nk} − Mξ_{nk}
by F′_{nk}(x) and their characteristic function by f′_{nk}(t), then (4) can be
transformed into the following equivalent form:

\[ f_n(t) = e^{-itA_n + it\sum_{k=1}^{k_n} M\xi_{nk}}\,\prod_{k=1}^{k_n} f'_{nk}(t) \Rightarrow f(t). \qquad (5) \]

Put

\[ f'_{nk}(t) - 1 = \alpha_{nk}(t) = \alpha_{nk}. \]

By (α) and Theorem 1 of § 20

\[ \sup_{1\le k\le k_n}|\alpha_{nk}| \Rightarrow 0. \]

Hence, considering an arbitrary but fixed interval of t, we may suppose
that from some n on,‡

\[ \sup_{1\le k\le k_n}|\alpha_{nk}| < \frac12. \]
* Theorem 1 is a slight modification of a theorem of Bawly [1]. 

† Translator's note. The phrase "suitably chosen" is ambiguous here. For any
given A_n, the distribution functions of (2) will converge if and only if the
"accompanying distribution functions," which involve the same A_n, converge. In some
later theorems it will be shown how the A_n are to be chosen.

‡ From this it also follows that from some n on, f_n(t) does not vanish in the
interval considered, and consequently log f_n(t) is defined. This remark is needed
below.
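We have then the estimate

\[ |\log f'_{nk}(t) - \alpha_{nk}| = |\log(1+\alpha_{nk}) - \alpha_{nk}| = \left|\sum_{s=2}^{\infty}\frac{(-1)^{s-1}}{s}\,\alpha_{nk}^s\right| \le \frac12\sum_{s=2}^{\infty}|\alpha_{nk}|^s = \frac12\cdot\frac{|\alpha_{nk}|^2}{1-|\alpha_{nk}|} \le |\alpha_{nk}|^2. \qquad (6) \]

Since

\[ \int x\,dF'_{nk}(x) = M\xi'_{nk} = 0, \]

the expression for α_{nk} can be written as

\[ \alpha_{nk} = \int(e^{itx}-1)\,dF'_{nk}(x) = \int(e^{itx}-1-itx)\,dF'_{nk}(x). \]

But it is well known that for every real x

\[ |e^{itx}-1-itx| \le \frac{t^2x^2}{2}, \]

so that

\[ |\alpha_{nk}| \le \frac{t^2}{2}\int x^2\,dF'_{nk}(x) = \frac{t^2}{2}\,\sigma_{nk}^2. \qquad (7) \]

Now by (5), (6), (7), and (β), we have

\[ \left|\log f_n(t) + itA_n - \sum_{k=1}^{k_n}\left\{itM\xi_{nk} + \int(e^{itx}-1)\,dF'_{nk}(x)\right\}\right| = \left|\sum_{k=1}^{k_n}\left\{\log f'_{nk}(t) - \int(e^{itx}-1)\,dF'_{nk}(x)\right\}\right| \]
\[ \le \sum_{k=1}^{k_n}|\alpha_{nk}|^2 \le \sup_{1\le k\le k_n}|\alpha_{nk}|\,\sum_{k=1}^{k_n}|\alpha_{nk}| \le \frac{t^2}{2}\,\sup_{1\le k\le k_n}|\alpha_{nk}|\,\sum_{k=1}^{k_n}\sigma_{nk}^2 \le \frac{Ct^2}{2}\,\sup_{1\le k\le k_n}|\alpha_{nk}|. \qquad (8) \]

From this we conclude that*

\[ \log f_n(t) - \psi_n(t) \Rightarrow 0. \]

Since e^{ψ_n(t)} is a characteristic function and consequently does not exceed
one in modulus,

\[ f_n(t) - e^{\psi_n(t)} \Rightarrow 0. \qquad (9) \]

But (9) is obviously equivalent to the assertion of the theorem.
Theorem 1 and Theorem 3 of § 17 imply the following:


COROLLARY. The limit laws for the sums (2) of independent random
variables, subject to the conditions (α) and (β), are infinitely divisible.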


* Cf. the preceding footnote. 
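
The estimate (8) and the closeness of f_n(t) to the accompanying law can be observed on a concrete array. The Python sketch below assumes a hypothetical example that is not in the text: ξ_{nk} = ±1/√n with probability 1/2 each, k_n = n, A_n = 0, so that Mξ_{nk} = 0, C = 1, log f_n(t) = n log cos(t/√n) and ψ_n(t) = n(cos(t/√n) − 1).

```python
import math

def log_f_n(t: float, n: int) -> float:
    """log f_n(t) = n log cos(t/sqrt(n)); requires cos(t/sqrt(n)) > 0."""
    return n * math.log(math.cos(t / math.sqrt(n)))

def psi_n(t: float, n: int) -> float:
    """Logarithm (3) of the accompanying law: n (cos(t/sqrt(n)) - 1)."""
    return n * (math.cos(t / math.sqrt(n)) - 1.0)

def bound_8(t: float, n: int) -> float:
    """Right side of (8) with C = 1: (t^2/2) sup_k |alpha_nk|."""
    return 0.5 * t * t * abs(math.cos(t / math.sqrt(n)) - 1.0)

# Both logarithms approach -t^2/2, the logarithm of the normal characteristic function.
print(round(log_f_n(1.0, 10 ** 6), 6), round(psi_n(1.0, 10 ** 6), 6))
```

In this example the accompanying law is a compound Poisson law, and its limit, like that of the sums themselves, is the normal law.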
We shall now transform the sum (3) into the usual form for the logarithm
of the characteristic function of an infinitely divisible law. For this purpose
we shall consider the function

\[ K_n(u) = \sum_{k=1}^{k_n}\int_{-\infty}^{u}x^2\,dF_{nk}(x+M\xi_{nk}). \qquad (10) \]

Obviously, the function K_n(u) is nondecreasing and satisfies the condition
K_n(−∞) = 0, and by (β) K_n(+∞) is bounded.

It is easily calculated that the function ψ_n(t) can be rewritten in the
following form:

\[ \psi_n(t) = -it\Bigl(A_n - \sum_{k=1}^{k_n}M\xi_{nk}\Bigr) + \int(e^{itu}-1-itu)\,\frac{1}{u^2}\,dK_n(u). \qquad (11) \]

Remark. We remark that the variance of the sum (2) and of the infinitely
divisible law defined by (3) are equal.

In fact, according to (11) of § 18 the variance of the infinitely divisible
law (3) is

\[ K_n(+\infty) = \sum_{k=1}^{k_n}\int x^2\,dF_{nk}(x+M\xi_{nk}) = \sum_{k=1}^{k_n}D^2\xi_{nk} = D^2\zeta_n. \]

The theorem just proved, together with Theorem 3 of § 19 of the
preceding chapter, enables us to establish conditions for the existence of
limit distribution functions for sums of independent random variables
which satisfy the conditions (α) and (β).


THEOREM 2.* In order that for suitably chosen constants A_n the distribution
laws of the sums

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} - A_n \]

of independent random variables satisfying the condition (α) converge to
a limit, and that the variances of these sums converge to the variance of the
limit law, it is necessary and sufficient that there exist a nondecreasing
function K(u) such that

\[ K_n(u) \Rightarrow K(u) \]

as n → ∞, where K_n(u) is defined by (10).

The constants A_n may be chosen according to the formula

\[ A_n = \sum_{k=1}^{k_n}M\xi_{nk} - \gamma + o(1), \]

where γ is any constant.

* B. V. Gnedenko [37], [41].

The logarithm of the characteristic function of the limit law is given by
Kolmogorov's formula (10) of § 18 with the constant γ and the function
K(u) just defined.


Proof. From the condition that K_n(u) ⇒ K(u), it follows that

\[ K_n(+\infty) = \sum_{k=1}^{k_n}D^2\xi_{nk} \]

is bounded: thus we find ourselves under the conditions (α)
and (β). By the preceding theorem and the remark after it, we may confine
ourselves to finding conditions for the existence of a limit law for infinitely
divisible laws, the logarithms of whose characteristic functions are given
by (11). As we know, this limit will also be infinitely divisible. Our theorems
are obtained from (10) and (11) as immediate consequences of Theorem
3 of § 19. In fact, the condition K_n(u) ⇒ K(u) of the present theorem
coincides with the condition (1) of Theorem 3 of § 19. The condition (2) of
Theorem 3 of § 19 can be written as follows:

\[ \gamma_n = -A_n + \sum_{k=1}^{k_n}\int x\,dF_{nk}(x) \to \gamma \qquad (n \to \infty), \]

where γ is a constant, determined by Kolmogorov's formula for the
limit law. From this relation we see that A_n can be chosen as indicated in
the theorem.

Remark. For the convergence of the distribution laws of the sums

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} \]

to a limit, and the convergence of the variances of the sums to the variance
of the limit law, it is necessary and sufficient that besides the condition
K_n(u) ⇒ K(u) of Theorem 2 the following condition also be satisfied:

\[ \sum_{k=1}^{k_n}\int x\,dF_{nk}(x) \to \gamma \qquad (n \to \infty). \]

It should be remarked that the preceding theorem gives not only
conditions for the existence of a limit law but also conditions for the
convergence of the distribution functions of the sum to any given limit
law defined by the constant γ and the function K(u) according to
Kolmogorov's formula.

As an application of the general theorem just proved, we consider the
conditions for convergence to a normal law and to a Poisson law.


THEOREM 3. In order that for suitably chosen constants A_n the distribution
functions of the sums

\[ \zeta_n = \xi_{n1} + \xi_{n2} + \dots + \xi_{nk_n} - A_n \]

of independent random variables ξ_{n1}, ξ_{n2}, ..., ξ_{nk_n} converge to the normal
law

\[ \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{z^2}{2}}\,dz, \qquad (12) \]

and that the variances of ζ_n converge to one and that the variables ξ_{nk} − Mξ_{nk}
be infinitesimal, it is necessary and sufficient that for every ε > 0 the
following relations hold:

(1) \[ \sum_{k=1}^{k_n}\int_{|x|\ge\varepsilon}x^2\,dF_{nk}(x+M\xi_{nk}) \to 0 \qquad (n \to \infty), \]

(2) \[ \sum_{k=1}^{k_n}\int_{|x|<\varepsilon}x^2\,dF_{nk}(x+M\xi_{nk}) \to 1 \qquad (n \to \infty), \]

where F_{nk}(x) is the distribution function of ξ_{nk}.


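Proof. From the first condition of the theorem we conclude that the
variables $\xi_{nk} - M\xi_{nk}$ are infinitesimal. In fact, for every $\varepsilon > 0$, we have

$$\sup_{1 \le k \le k_n} P\{|\xi_{nk} - M\xi_{nk}| \ge \varepsilon\} = \sup_{1 \le k \le k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x + M\xi_{nk}) \le \frac{1}{\varepsilon^2} \sup_{1 \le k \le k_n} \int_{|x| \ge \varepsilon} x^2\, dF_{nk}(x + M\xi_{nk}) \to 0 \quad \text{as } n \to \infty.$$

The relations (1) and (2) together prove that (β) holds.

Furthermore, we know that for the law (12)

$$K(u) = \begin{cases} 0 & \text{for } u < 0, \\ 1 & \text{for } u > 0. \end{cases}$$

In our case the condition $K_n(u) \Rightarrow K(u)$ of Theorem 2 can be written
as follows:

$$(1) \qquad \sum_{k=1}^{k_n} \int_{-\infty}^{u} x^2\, dF_{nk}(x + M\xi_{nk}) \to \begin{cases} 0 & \text{for } u < 0, \\ 1 & \text{for } u > 0, \end{cases}$$

$$(2) \qquad \sum_{k=1}^{k_n} \int x^2\, dF_{nk}(x + M\xi_{nk}) \to 1.$$

It is easy to see that these relations are equivalent to the conditions of the
theorem.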


Particular case. The theorem assumes an especially simple form if we
suppose that for all n

$$\sum_{k=1}^{k_n} D\xi_{nk} = 1.$$

Under this condition, for the convergence of the distribution laws of the
sums (1) to the normal law (12) it is necessary and sufficient that for every
$\varepsilon > 0$

$$\sum_{k=1}^{k_n} \int_{|x| > \varepsilon} x^2\, dF_{nk}(x + M\xi_{nk}) \to 0 \qquad (n \to \infty). \qquad (13)$$

If, moreover, for all k and n

$$M\xi_{nk} = 0,$$

then (13) can be rewritten as

$$\sum_{k=1}^{k_n} \int_{|x| > \varepsilon} x^2\, dF_{nk}(x) \to 0 \qquad (n \to \infty).$$

The last relation contains the following





THEOREM 4.* Let

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be a sequence of independent random variables and let the distribution
function of $\xi_k$ be $F_k(x)$.
In order that the distribution function of the normalized sums

$$\zeta_n = \frac{1}{B_n} \sum_{k=1}^{n} (\xi_k - M\xi_k) \qquad \Bigl( B_n^2 = \sum_{k=1}^{n} D\xi_k \Bigr) \qquad (14)$$

converge to the normal law (12) and that the summands be infinitesimal,
it is necessary and sufficient that Lindeberg's condition

$$\frac{1}{B_n^2} \sum_{k=1}^{n} \int_{|x| > \varepsilon B_n} x^2\, dF_k(x + M\xi_k) \to 0 \qquad (n \to \infty)$$

be satisfied for every $\varepsilon > 0$.
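
Lindeberg's condition can be evaluated exactly when every summand has a finite discrete distribution. The following is only a sketch under that assumption; the helper names `lindeberg_ratio`, `rademacher_rows`, and `doubling_rows` are illustrative and not part of the text. It computes the ratio $\frac{1}{B_n^2} \sum_k M[(\xi_k - M\xi_k)^2;\ |\xi_k - M\xi_k| > \varepsilon B_n]$ for two sequences: symmetric $\pm 1$ variables, where the ratio vanishes as soon as $\varepsilon B_n \ge 1$, and variables $\pm 2^k$, where the last summand alone carries about three quarters of the total variance, so the ratio stays away from zero.

```python
import math


def lindeberg_ratio(rows, eps):
    """Return (1/B_n^2) * sum_k E[(xi_k - M xi_k)^2 ; |xi_k - M xi_k| > eps * B_n].

    `rows` holds one discrete distribution per summand, each given as a
    list of (value, probability) pairs (an assumption of this sketch).
    """
    centered = []
    total_var = 0.0
    for dist in rows:
        mean = sum(v * p for v, p in dist)
        c = [(v - mean, p) for v, p in dist]
        total_var += sum(x * x * p for x, p in c)
        centered.append(c)
    b_n = math.sqrt(total_var)
    # Sum of truncated second moments over the region |x| > eps * B_n.
    tail = sum(x * x * p for c in centered for x, p in c if abs(x) > eps * b_n)
    return tail / total_var


def rademacher_rows(n):
    """n independent variables equal to +1 or -1 with probability 1/2 each."""
    return [[(-1.0, 0.5), (1.0, 0.5)] for _ in range(n)]


def doubling_rows(n):
    """n independent variables, the k-th equal to +2^k or -2^k with probability 1/2."""
    return [[(-(2.0 ** k), 0.5), (2.0 ** k, 0.5)] for k in range(1, n + 1)]


if __name__ == "__main__":
    for n in (5, 20, 100):
        print(n, lindeberg_ratio(rademacher_rows(n), 0.2),
              lindeberg_ratio(doubling_rows(n), 0.5))
```

For the $\pm 2^k$ sequence the normalized summands are not infinitesimal, which is consistent with the failure of the condition.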


In the paper [9], S. N. Bernstein offered another derivation of this
result of Lindeberg and Feller, proving that it is a consequence of the
classical theorem of Lyapunov. Moreover, S. N. Bernstein proved there
that Lyapunov's theorem gives a necessary and sufficient condition for
the convergence of the distribution functions of the sums (14) to the
normal law (12) under the additional requirement that the moments of
order $2 + \delta$ of the normalized sums should converge to the corresponding
moment of the normal law.


* Lindeberg [77], W. Feller [27].

THEOREM 5.* In order that for suitably chosen constants $A_n$ the distribution
functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n \qquad (15)$$

of independent infinitesimal random variables $\xi_{nk}$ ($1 \le k \le k_n$) converge as
$n \to \infty$ to the Poisson law

$$P(x) = e^{-\lambda} \sum_{0 \le k < x} \frac{\lambda^k}{k!} \qquad (16)$$

and that the variance of the sums (15) converge to $\lambda$, it is necessary and
sufficient that for every $\varepsilon > 0$

$$\sum_{k=1}^{k_n} \int_{|x - 1| \ge \varepsilon} x^2\, dF_{nk}(x + M\xi_{nk}) \to 0,$$

$$\sum_{k=1}^{k_n} \int_{|x - 1| < \varepsilon} x^2\, dF_{nk}(x + M\xi_{nk}) \to \lambda,$$

where $F_{nk}(x)$ is the distribution function of $\xi_{nk}$.
The constants $A_n$ may be determined from the equation

$$A_n = \sum_{k=1}^{k_n} M\xi_{nk} - \lambda.$$


Proof. Theorem 5 follows readily from Theorem 2 and the fact that for
the law (16) $\gamma = \lambda$, $K(u) = 0$ for $u < 1$ and $K(u) = \lambda$ for $u > 1$.
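
As an illustration of the two conditions of Theorem 5, consider the classical array of rare events, in which each $\xi_{nk}$ ($k = 1, \ldots, n$) equals 1 with probability $\lambda/n$ and 0 otherwise. This array and the function name `poisson_condition_sums` are assumptions made for the example, not taken from the text. The sketch evaluates the sums of $x^2$ over the centered distributions, split into the regions $|x - 1| \ge \varepsilon$ and $|x - 1| < \varepsilon$; the first should be small and the second close to $\lambda$ for large n.

```python
def poisson_condition_sums(n, lam, eps):
    """For xi_nk ~ Bernoulli(lam / n), k = 1..n, return (outside, inside).

    outside: sum over k of E[x^2 ; |x - 1| >= eps],
    inside:  sum over k of E[x^2 ; |x - 1| <  eps],
    where x = xi_nk - M xi_nk is the centered variable.
    """
    p = lam / n
    # Centered values and their probabilities: 0 - p and 1 - p.
    atoms = [(-p, 1.0 - p), (1.0 - p, p)]
    outside = n * sum(x * x * q for x, q in atoms if abs(x - 1.0) >= eps)
    inside = n * sum(x * x * q for x, q in atoms if abs(x - 1.0) < eps)
    return outside, inside


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, poisson_condition_sums(n, lam=3.0, eps=0.1))
```

In closed form the two sums are $n p^2 (1 - p)$ and $n p (1 - p)^2$ with $p = \lambda/n$ (once $p < \varepsilon$), so they tend to 0 and to $\lambda$ respectively.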

Just as in the case of the normal law, the conditions of the theorem
assume the simplest form if we suppose that for each n the variance
of the sum $\zeta_n$ is equal to the variance of the limit law, i.e., if

$$\sum_{k=1}^{k_n} D\xi_{nk} = \lambda.$$

Under this additional assumption, for the convergence of the distribution
functions of the sums (15) to the Poisson law (16) it is necessary and
sufficient that for every $\varepsilon > 0$ the following relation be satisfied:

$$\sum_{k=1}^{k_n} \int_{|x - 1| > \varepsilon} x^2\, dF_{nk}(x + M\xi_{nk}) \to 0 \qquad (n \to \infty).$$


* B. V. Gnedenko [41].




§ 22. LAW OF LARGE NUMBERS


We shall begin the study of the limiting behavior of sums of independent
variables in the general case with the law of large numbers. The
conditions found here form the basis of proof of subsequent theorems.


DEFINITION 1. The sequence of random variables

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

converges in probability to the random variable $\xi$, if for every $\varepsilon > 0$

$$P\{|\xi_n - \xi| \ge \varepsilon\} \to 0 \quad \text{as } n \to \infty.$$


In what follows we shall denote convergence in probability by the
symbol $\xrightarrow{P}$. The preceding relation can then be written as

$$\xi_n \xrightarrow{P} \xi \qquad (n \to \infty).$$


DEFINITION 2. The sequence $\{\xi_n\}$ is called stable if there exists a sequence
of constants $\{A_n\}$ such that as $n \to \infty$

$$\xi_n - A_n \xrightarrow{P} 0. \qquad (1)$$


Repeating the argument carried out in § 20 (Lemma 1), we see that
if the sequence $\{\xi_n\}$ is stable, then the constants $A_n$ in (1) may be taken
to be the medians $m\xi_n$. In other words, it follows from (1) that as $n \to \infty$

$$\xi_n - m\xi_n \xrightarrow{P} 0.$$

DEFINITION 3. The double sequence of random variables

$$\xi_{n1}, \xi_{n2}, \ldots, \xi_{nk_n} \qquad (n = 1, 2, 3, \ldots) \qquad (2)$$

obeys the law of large numbers if the sequence of sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}$$

is stable.


Our next problem consists in the determination of the most general 
conditions under which the law of large numbers holds. 


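THEOREM.* In order that the double sequence (2) obey the law of large
numbers, it is necessary and sufficient that as $n \to \infty$

$$\sum_{k=1}^{k_n} \int_{|x| \ge 1} dF_{nk}(x + m_{nk}) \to 0, \qquad \sum_{k=1}^{k_n} \int_{|x| < 1} x^2\, dF_{nk}(x + m_{nk}) \to 0, \qquad (3)$$

where $m_{nk}$ is the median of $\xi_{nk}$.


* See [63], [28], [41].

Proof. The sufficiency of the conditions of the theorem can be derived in
a completely elementary way. For this purpose we introduce the notations

$$\xi'_{nk} = \xi_{nk} - m_{nk}, \qquad F'_{nk}(x) = P\{\xi'_{nk} < x\} = F_{nk}(x + m_{nk}).$$

Furthermore, we put

$$\xi''_{nk} = \begin{cases} \xi'_{nk} & \text{if } |\xi'_{nk}| < 1, \\ 0 & \text{if } |\xi'_{nk}| \ge 1, \end{cases}$$

and

$$A_n = \sum_{k=1}^{k_n} (m_{nk} + M\xi''_{nk}),$$

where

$$M\xi''_{nk} = \int_{|x| < 1} x\, dF'_{nk}(x).$$

Let

$$\zeta'_n = \sum_{k=1}^{k_n} \xi'_{nk}, \qquad \zeta''_n = \sum_{k=1}^{k_n} \xi''_{nk},$$

and let $B_n$ be the event that $\zeta'_n = \zeta''_n$. If $\bar{B}_n$ is the event complementary to
$B_n$, then it is evident that

$$P\{\bar{B}_n\} \le \sum_{k=1}^{k_n} P\{|\xi'_{nk}| \ge 1\} = \sum_{k=1}^{k_n} \int_{|x| \ge 1} dF'_{nk}(x). \qquad (4)$$

Obviously, for every $\varepsilon > 0$

$$P\{|\zeta_n - A_n| > \varepsilon\} = P\{B_n\}\, P\{|\zeta_n - A_n| > \varepsilon \mid B_n\} + P\{\bar{B}_n\}\, P\{|\zeta_n - A_n| > \varepsilon \mid \bar{B}_n\}. \qquad (5)$$

Since

$$P\{|\zeta_n - A_n| > \varepsilon \mid B_n\}\, P\{B_n\} \le P\{|\zeta''_n - M\zeta''_n| > \varepsilon\},$$

according to Chebyshev's inequality

$$P\{|\zeta_n - A_n| > \varepsilon \mid B_n\}\, P\{B_n\} \le \frac{1}{\varepsilon^2} D\zeta''_n = \frac{1}{\varepsilon^2} \sum_{k=1}^{k_n} D\xi''_{nk} \le \frac{1}{\varepsilon^2} \sum_{k=1}^{k_n} M\xi''^{\,2}_{nk} = \frac{1}{\varepsilon^2} \sum_{k=1}^{k_n} \int_{|x| < 1} x^2\, dF'_{nk}(x). \qquad (6)$$

From (4), (5), and (6) taken together, we find that for every $\varepsilon > 0$

$$P\{|\zeta_n - A_n| > \varepsilon\} \le \frac{1}{\varepsilon^2} \sum_{k=1}^{k_n} \int_{|x| < 1} x^2\, dF'_{nk}(x) + \sum_{k=1}^{k_n} \int_{|x| \ge 1} dF'_{nk}(x),$$

which proves the sufficiency of the conditions of the theorem.

To prove the necessity of the conditions of the theorem, we make use
of the apparatus of characteristic functions, for elementary methods lead
to very cumbersome arguments.*

If the law of large numbers holds, then for some $A_n$

$$\zeta_n - A_n \xrightarrow{P} 0$$

as $n \to \infty$. In other words, as $n \to \infty$ the distribution function of the sum
$\zeta_n - A_n$ converges to the unitary law $\varepsilon(x)$ or, in terms of characteristic
functions,

$$e^{-iA_n t} \prod_{k=1}^{k_n} f_{nk}(t) \to 1 \qquad (n \to \infty).$$

From this we conclude that as $n \to \infty$

$$\prod_{k=1}^{k_n} |f_{nk}(t)| \to 1. \qquad (7)$$

From (7) we deduce in particular that as $n \to \infty$

$$\sup_{1 \le k \le k_n} (1 - |f_{nk}(t)|) \to 0. \qquad (8)$$

According to Theorem 3 of § 14† this implies the asymptotic constancy
of the variables $\xi_{nk}$. Thus, if the sums $\zeta_n$ of independent random variables
are stable, the summands are asymptotically constant.
From (7) and the inequality

$$-\log(1 - \alpha) \ge \alpha \qquad (0 \le \alpha < 1),$$

we easily find that as $n \to \infty$

$$\sum_{k=1}^{k_n} (1 - |f_{nk}(t)|) \to 0. \qquad (9)$$

We shall first prove the necessity of the condition of the theorem in the
particular case that all the summands $\xi_{nk}$ are symmetrical. A little later
we shall reduce the general case to this particular case.

Let $f(t)$ be the characteristic function of the symmetrical random
variable $\xi$. Hence $m\xi = 0$ and

$$f(t) = \int \cos tx\, dF(x).$$

We have

$$\int_{-1}^{1} (1 - f(t))\, dt = 2 \int \Bigl(1 - \frac{\sin x}{x}\Bigr) dF(x).$$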
—1 


* See [63].
† Translator's note. This is not sufficient. We need the amended Theorem 1 of
§ 20, trivially modified for asymptotic constancy.




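Now for $|x| > 1$

$$1 - \frac{\sin x}{x} > \frac{1}{7},$$

and for $|x| \le 1$

$$1 - \frac{\sin x}{x} \ge \frac{x^2}{7}.$$

Thus

$$\int_{-1}^{1} (1 - f(t))\, dt \ge \frac{2}{7} \Bigl\{ \int_{|x| < 1} x^2\, dF(x) + \int_{|x| \ge 1} dF(x) \Bigr\},$$

and so

$$\int_{-1}^{1} \sum_{k=1}^{k_n} (1 - f_{nk}(t))\, dt = \sum_{k=1}^{k_n} \int_{-1}^{1} (1 - f_{nk}(t))\, dt \ge \frac{2}{7} \Bigl\{ \sum_{k=1}^{k_n} \int_{|x| < 1} x^2\, dF_{nk}(x) + \sum_{k=1}^{k_n} \int_{|x| \ge 1} dF_{nk}(x) \Bigr\} \ge 0.$$

From this and (9) follows the necessity of the conditions of the theorem.
[Remember that in the symmetrical case $m\xi_{nk} = 0$ and $f_{nk}(t)$ is real. By (8),
from some n on, $|f_{nk}(t)| = f_{nk}(t)$ throughout the interval $|t| \le 1$.]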
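
The step above rests on a lower bound for $1 - \frac{\sin x}{x}$ by a multiple of the function equal to $x^2$ for $|x| < 1$ and to 1 for $|x| \ge 1$. The following sketch checks on a grid that the multiple $1/7$ is admissible. The constant $1/7$, the grid, and the function names are assumptions of this check; a grid test is a plausibility check, not a proof.

```python
import math


def one_minus_sinc(x):
    """Return 1 - sin(x)/x, with the limiting value 0 at x = 0."""
    return 0.0 if x == 0.0 else 1.0 - math.sin(x) / x


def s1(x):
    """x^2 for |x| < 1 and 1 for |x| >= 1."""
    return x * x if abs(x) < 1.0 else 1.0


def min_ratio(step=1e-3, x_max=50.0):
    """Smallest value of (1 - sin x / x) / s1(x) on the grid step, 2*step, ..., x_max.

    Both functions are even, so positive x suffices.
    """
    n_points = int(round(x_max / step))
    return min(one_minus_sinc(i * step) / s1(i * step)
               for i in range(1, n_points + 1))


if __name__ == "__main__":
    print(min_ratio(), 1.0 / 7.0)
```

On this grid the minimum is attained near $x = 1$, where the ratio equals $1 - \sin 1$, which exceeds $1/7$.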

We turn to the proof of the necessity of the conditions of the theorem
in the general case. Consider the random variables $\eta_{nk}$, independent among
themselves and of all $\xi_{nk}$ for each n, and such that $\xi_{nk}$ and $\eta_{nk}$ have the
same distribution.

Put

$$\xi^*_{nk} = \xi_{nk} - \eta_{nk}$$

and

$$\zeta^*_n = \sum_{k=1}^{k_n} \xi^*_{nk}.$$

The variables $\xi^*_{nk}$ are symmetrical. Their characteristic functions are equal
to $|f_{nk}(t)|^2$.

Consequently, by (7), $\zeta^*_n$ converges in probability to zero.

By what we have proved this means that as $n \to \infty$

$$\sum_{k=1}^{k_n} \int_{|x| < 1} x^2\, dF^*_{nk}(x) \to 0, \qquad \sum_{k=1}^{k_n} \int_{|x| \ge 1} dF^*_{nk}(x) \to 0,$$

where $F^*_{nk}(x)$ is the distribution function of $\xi^*_{nk}$.

Hence we obtain (3) with the help of the following lemma.

LEMMA. Let the random variables $\xi$ and $\eta$ be independent and identically
distributed, and the function $s_1(x)$ be defined to be

$$s_1(x) = \begin{cases} x^2 & \text{for } |x| < 1, \\ 1 & \text{for } |x| \ge 1. \end{cases}$$

Then

$$M s_1(\xi - \eta) \ge \tfrac{1}{2} M s_1(\xi - m),$$

where m is the median of $\xi$.


Proof. Without loss of generality we may suppose that $m = 0$, since
otherwise we may consider the random variables $\xi - m$ and $\eta - m$.
Obviously, if x and y are of opposite signs, $s_1(x - y) \ge s_1(x)$.

Therefore

$$\begin{aligned}
M s_1(\xi - \eta) &= \iint s_1(x - y)\, dF(x)\, dF(y) \\
&\ge \Bigl\{ \iint_{x \ge 0,\ y \le 0} + \iint_{x < 0,\ y \ge 0} \Bigr\}\, s_1(x - y)\, dF(x)\, dF(y) \\
&\ge \int_{y \le 0} dF(y) \int_{x \ge 0} s_1(x)\, dF(x) + \int_{y \ge 0} dF(y) \int_{x < 0} s_1(x)\, dF(x) \\
&\ge \tfrac{1}{2} \int s_1(x)\, dF(x) = \tfrac{1}{2} M s_1(\xi).
\end{aligned}$$


Remark. Obviously, with the help of the function $s_1(x)$ the conditions
of the theorem can be written as follows: as $n \to \infty$

$$\sum_{k=1}^{k_n} M s_1(\xi_{nk} - m_{nk}) \to 0.$$

Since

$$\frac{x^2}{1 + x^2} \le s_1(x) \le \frac{2x^2}{1 + x^2},$$

it is evident that the conditions of the theorem are also equivalent to the
condition that as $n \to \infty$

$$\sum_{k=1}^{k_n} M \frac{(\xi_{nk} - m_{nk})^2}{1 + (\xi_{nk} - m_{nk})^2} \to 0. \qquad (10)$$
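
The two-sided bound between $s_1(x)$ and $x^2/(1 + x^2)$, and the resulting equivalence of the two forms of the criterion, can be illustrated numerically. The sketch below assumes a simple array that is not taken from the text: $\xi_{nk} = X_k / n$ for $k = 1, \ldots, n$, with $X_k = \pm 1$ with probability 1/2 each, for which 0 is a median. The names `s1`, `g`, and `criterion_sums` are illustrative.

```python
def s1(x):
    """x^2 for |x| < 1 and 1 for |x| >= 1."""
    return x * x if abs(x) < 1.0 else 1.0


def g(x):
    """x^2 / (1 + x^2)."""
    return x * x / (1.0 + x * x)


def criterion_sums(n):
    """Both forms of the law-of-large-numbers criterion for xi_nk = X_k / n.

    X_k takes the values +1 and -1 with probability 1/2 each; the median
    of xi_nk is taken to be 0 (any point of [-1/n, 1/n] would qualify).
    Returns (sum_k M s1(xi_nk), sum_k M g(xi_nk)).
    """
    atoms = [(-1.0 / n, 0.5), (1.0 / n, 0.5)]
    s1_sum = n * sum(s1(x) * p for x, p in atoms)
    g_sum = n * sum(g(x) * p for x, p in atoms)
    return s1_sum, g_sum


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, criterion_sums(n))
```

For this array the two sums equal $1/n$ and $n/(n^2 + 1)$, so both tend to zero together, as the sandwich inequality requires.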


§ 23. TWO AUXILIARY THEOREMS

We shall now turn to the general problem of determining the limit
laws for sums of independent random variables. We confine ourselves to
considering a double sequence

$$\xi_{n1}, \xi_{n2}, \ldots, \xi_{nk_n}$$

of random variables which are independent in each row and asymptotically
constant.

We begin the investigation of this problem with the proof of two
theorems which show that even in the general case, where we can no longer
require the existence of variances for the summands, there is a valid
inequality analogous to the condition (β) of § 21. This circumstance enables
us to retain the idea of proof of the theorems of § 21 and to extend them
to cover the general case.


THEOREM 1.* If for some suitably chosen constants $A_n$ the distribution
functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n$$

of independent random variables $\xi_{nk}$ converge to a limit, then there exists
a constant $C < \infty$ such that

$$\sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\, dF_{nk}(x + m_{nk}) < C. \qquad (1)$$

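Proof. This theorem, which is basic for all that follows, is a simple
consequence of the results of the preceding section.
By hypothesis, the distribution functions of the sums $\zeta_n$ converge to a
limit as $n \to \infty$. Hence it follows easily that for every sequence of constants

$$\alpha_n \to 0$$

we have

$$\alpha_n \zeta_n \xrightarrow{P} 0.$$

But for $0 < \alpha < 1$,

$$M \frac{\alpha^2 \xi^2}{1 + \alpha^2 \xi^2} = \alpha^2 M \frac{\xi^2}{1 + \alpha^2 \xi^2} \ge \alpha^2 M \frac{\xi^2}{1 + \xi^2}.$$

Hence for sufficiently large n ($\xi'_{nk} = \xi_{nk} - m_{nk}$)

$$\sum_{k=1}^{k_n} M \frac{\alpha_n^2 \xi'^{\,2}_{nk}}{1 + \alpha_n^2 \xi'^{\,2}_{nk}} \ge \alpha_n^2 \sum_{k=1}^{k_n} M \frac{\xi'^{\,2}_{nk}}{1 + \xi'^{\,2}_{nk}},$$

and, consequently, as $n \to \infty$

$$\alpha_n^2 \sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\, dF_{nk}(x + m_{nk}) \to 0.$$

If the sums (1) were not bounded, then the preceding relation could
not hold for every sequence $\alpha_n \to 0$.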

Remark. It is possible to give another proof of Theorem 1 without 
making use of the results of the preceding section. In this respect it is 
interesting that further results, among which the law of large numbers 
is also counted, can thus be obtained by a single method without using 
the results of § 22 (see B. V. Gnedenko [41]).


* For another proof see B. V. Gnedenko [41]. 




THEOREM 2.* If for some suitably chosen constants $A_n$ the distribution
functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n$$

of independent infinitesimal random variables converge to a limit as $n \to \infty$,
then there exists a constant C such that

$$\sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\, dF_{nk}(x + \alpha_{nk}) < C,$$

where

$$\alpha_{nk} = \int_{|x| < \tau} x\, dF_{nk}(x),$$

and $\tau$ is any positive constant.


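Proof. We note first that since the variables $\xi_{nk}$ are infinitesimal the
following relations hold:†

$$\sup_{1 \le k \le k_n} |m_{nk}| \to 0 \qquad (n \to \infty),$$

$$\sup_{1 \le k \le k_n} |\alpha_{nk}| \to 0 \qquad (n \to \infty).$$

Because of the elementary inequality $(a + b)^2 \le 2(a^2 + b^2)$, we conclude
that for sufficiently large n

$$\begin{aligned}
\int \frac{x^2}{1 + x^2}\, dF_{nk}(x + \alpha_{nk}) &= \int \frac{(x + m_{nk} - \alpha_{nk})^2}{1 + (x + m_{nk} - \alpha_{nk})^2}\, dF_{nk}(x + m_{nk}) \\
&\le \int \frac{2x^2}{1 + (x + m_{nk} - \alpha_{nk})^2}\, dF_{nk}(x + m_{nk}) + 2(m_{nk} - \alpha_{nk})^2 \\
&\le 3 \int \frac{x^2}{1 + x^2}\, dF_{nk}(x + m_{nk}) + 2(m_{nk} - \alpha_{nk})^2.
\end{aligned}$$

Now we estimate the difference $(m_{nk} - \alpha_{nk})^2$. We have, for sufficiently
large n,

$$\begin{aligned}
(\alpha_{nk} - m_{nk})^2 &= \Bigl( \int_{|x| < \tau} (x - m_{nk})\, dF_{nk}(x) - \int_{|x| \ge \tau} m_{nk}\, dF_{nk}(x) \Bigr)^2 \\
&\le 2 \Bigl( \int_{|x + m_{nk}| < \tau} |x|\, dF_{nk}(x + m_{nk}) \Bigr)^2 + 2 \Bigl( \int_{|x + m_{nk}| \ge \tau} m_{nk}\, dF_{nk}(x + m_{nk}) \Bigr)^2 \\
&\le 2 \Bigl( \int_{|x| < 2\tau} |x|\, dF_{nk}(x + m_{nk}) \Bigr)^2 + 2 m_{nk}^2 \Bigl( \int_{|x| \ge \tau/2} dF_{nk}(x + m_{nk}) \Bigr)^2.
\end{aligned}$$

Finally, applying the inequality of Cauchy and Bunyakovski, we find
that

$$(\alpha_{nk} - m_{nk})^2 \le 2 \int_{|x| < 2\tau} x^2\, dF_{nk}(x + m_{nk}) + 2 m_{nk}^2 \int_{|x| \ge \tau/2} dF_{nk}(x + m_{nk}).$$

From the inequalities obtained and from Theorem 1 follows the assertion
of Theorem 2.


* B. V. Gnedenko [41].
† The second relation is derived as follows:

$$|\alpha_{nk}| = \Bigl| \int_{|x| < \tau} x\, dF_{nk}(x) \Bigr| \le \int_{|x| < \varepsilon} |x|\, dF_{nk}(x) + \int_{\varepsilon \le |x| < \tau} |x|\, dF_{nk}(x) \le \varepsilon + \tau P\{|\xi_{nk}| \ge \varepsilon\}.$$

Choosing $\varepsilon > 0$ sufficiently small, and n sufficiently large, we can make $\sup_k |\alpha_{nk}|$
as small as we wish.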


§ 24. THE GENERAL FORM OF THE LIMIT THEOREMS.
THE ACCOMPANYING INFINITELY DIVISIBLE LAWS


THEOREM 1.* In order that for some suitably chosen constants $A_n$ the
distribution functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n \qquad (1)$$

of independent infinitesimal random variables converge to a limit, it is
necessary and sufficient that the infinitely divisible laws,† the logarithms
of whose characteristic functions are given by the formula

$$\psi_n(t) = -iA_n t + \sum_{k=1}^{k_n} \Bigl\{ it\alpha_{nk} + \int (e^{itx} - 1)\, dF_{nk}(x + \alpha_{nk}) \Bigr\}, \qquad (2)$$

converge. Here

$$\alpha_{nk} = \int_{|x| < \tau} x\, dF_{nk}(x), \qquad (3)$$

and $\tau > 0$ is a constant. The limit distribution functions for the two
sequences coincide.


Proof. In order that the distribution functions of the sums converge
to a limit, it is necessary and sufficient that

$$f_n(t) = e^{-itA_n} \prod_{k=1}^{k_n} f_{nk}(t) \to f(t) \qquad (n \to \infty), \qquad (4)$$

where $f(t)$ is the characteristic function of the limit law and $f_n(t)$ is the
characteristic function of the sum (1).
We introduce the notation

$$\bar{F}_{nk}(x) = F_{nk}(x + \alpha_{nk}),$$

where $\alpha_{nk}$ is defined by (3).

* B. V. Gnedenko [36], [37], [11]. 
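† We shall again call them the accompanying laws.

Then (4) can be written in the following equivalent form:

$$e^{-itA_n + it \sum_{k=1}^{k_n} \alpha_{nk}} \prod_{k=1}^{k_n} \bar{f}_{nk}(t) \to f(t) \qquad (n \to \infty),$$

where $\bar{f}_{nk}(t)$ is the characteristic function corresponding to $\bar{F}_{nk}(x)$.
Set

$$\bar{f}_{nk}(t) - 1 = \beta_{nk};$$

since the variables $\xi_{nk}$ are infinitesimal,

$$\sup_{1 \le k \le k_n} |\beta_{nk}| \to 0 \qquad (n \to \infty), \qquad (5)$$

and we can make use of the expansion of the logarithm in a series:

$$\log \bar{f}_{nk}(t) = \log(1 + \beta_{nk}) = \beta_{nk} - \tfrac{1}{2} \beta_{nk}^2 + \tfrac{1}{3} \beta_{nk}^3 - \cdots$$

Hence, for sufficiently large n, we find

$$\Bigl| \sum_{k=1}^{k_n} \log \bar{f}_{nk}(t) - \sum_{k=1}^{k_n} \beta_{nk} \Bigr| \le \sum_{k=1}^{k_n} \sum_{s=2}^{\infty} \frac{|\beta_{nk}|^s}{s} \le \frac{1}{2} \sum_{k=1}^{k_n} \frac{|\beta_{nk}|^2}{1 - |\beta_{nk}|} \le \sup_{1 \le k \le k_n} |\beta_{nk}| \sum_{k=1}^{k_n} |\beta_{nk}|. \qquad (6)$$

Now

$$\begin{aligned}
|\beta_{nk}| &= \Bigl| \int_{|x| < \tau} (e^{itx} - 1 - itx)\, d\bar{F}_{nk}(x) + \int_{|x| \ge \tau} (e^{itx} - 1)\, d\bar{F}_{nk}(x) + it \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) \Bigr| \\
&\le \frac{t^2}{2} \int_{|x| < \tau} x^2\, d\bar{F}_{nk}(x) + 2 \int_{|x| \ge \tau} d\bar{F}_{nk}(x) + |t| \, \Bigl| \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) \Bigr|. \qquad (7)
\end{aligned}$$

Now we estimate

$$\int_{|x| < \tau} x\, d\bar{F}_{nk}(x) = \int_{|x - \alpha_{nk}| < \tau} x\, dF_{nk}(x) - \int_{|x| < \tau} x\, dF_{nk}(x) + \alpha_{nk} - \alpha_{nk} \int_{|x - \alpha_{nk}| < \tau} dF_{nk}(x).$$

We have for sufficiently large n (such that $|\alpha_{nk}| < \tau/2$)

$$\Bigl| \int_{|x - \alpha_{nk}| < \tau} x\, dF_{nk}(x) - \int_{|x| < \tau} x\, dF_{nk}(x) \Bigr| \le \int_{\tau - |\alpha_{nk}| \le |x| \le \tau + |\alpha_{nk}|} |x|\, dF_{nk}(x) \le \frac{3\tau}{2} \int_{|x| > \tau/2} dF_{nk}(x)$$

and

$$\Bigl| \alpha_{nk} - \alpha_{nk} \int_{|x - \alpha_{nk}| < \tau} dF_{nk}(x) \Bigr| = |\alpha_{nk}| \int_{|x - \alpha_{nk}| \ge \tau} dF_{nk}(x) \le \frac{\tau}{2} \int_{|x| > \tau/2} dF_{nk}(x).$$

Therefore

$$\Bigl| \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) \Bigr| \le 2\tau \int_{|x| > \tau/2} dF_{nk}(x). \qquad (8)$$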


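To prove the necessity of the conditions of the theorem we notice
that by Theorem 2 of § 23 if the distribution functions of the sums (1)
converge to a limit, then

$$\sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\, d\bar{F}_{nk}(x) < C. \qquad (9)$$

We have, consequently,

$$\sum_{k=1}^{k_n} \int_{|x| < \tau} x^2\, d\bar{F}_{nk}(x) \le (1 + \tau^2) \sum_{k=1}^{k_n} \int_{|x| < \tau} \frac{x^2}{1 + x^2}\, d\bar{F}_{nk}(x) < (1 + \tau^2) C,$$

$$\sum_{k=1}^{k_n} \int_{|x| \ge \tau} d\bar{F}_{nk}(x) \le \frac{1 + \tau^2}{\tau^2} \sum_{k=1}^{k_n} \int_{|x| \ge \tau} \frac{x^2}{1 + x^2}\, d\bar{F}_{nk}(x) < \frac{1 + \tau^2}{\tau^2}\, C.$$

Therefore, by virtue of (7) and (8),

$$\sum_{k=1}^{k_n} |\beta_{nk}| \le C_1,$$

where $C_1$ is a constant depending on t and $\tau$ but not on n. Thus it follows
from (5) and (6) that as $n \to \infty$

$$\begin{aligned}
|\log f_n(t) - \psi_n(t)| &= \Bigl| \Bigl[ -itA_n + it \sum_{k=1}^{k_n} \alpha_{nk} + \sum_{k=1}^{k_n} \log \bar{f}_{nk}(t) \Bigr] - \Bigl[ -itA_n + it \sum_{k=1}^{k_n} \alpha_{nk} + \sum_{k=1}^{k_n} \beta_{nk} \Bigr] \Bigr| \\
&= \Bigl| \sum_{k=1}^{k_n} \log \bar{f}_{nk}(t) - \sum_{k=1}^{k_n} \beta_{nk} \Bigr| \to 0.
\end{aligned}$$

Since $e^{\psi_n(t)}$ is a characteristic function and hence cannot exceed one in
modulus, we have

$$f_n(t) - e^{\psi_n(t)} \to 0, \qquad (10)$$

which proves the necessity of the conditions of the theorem.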


To prove the sufficiency of the conditions of the theorem we have also
to estimate the sum

$$\sum_{k=1}^{k_n} |\beta_{nk}|.$$

For this purpose we note that for the infinitely divisible laws defined
by (2), we should put

$$G_n(u) = \sum_{k=1}^{k_n} \int_{-\infty}^{u} \frac{x^2}{1 + x^2}\, d\bar{F}_{nk}(x)$$

in the formula of Lévy and Khintchine.
Hence, if the infinitely divisible laws defined by (2) converge to a limit,
then by Theorem 1 of § 19

$$\int dG_n(u) = \sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\, d\bar{F}_{nk}(x) \to \int dG(u) \qquad (n \to \infty),$$

where $G(u)$ is the monotone function defined by the formula of Lévy and
Khintchine for the limit law.

Thus we also have the relation (9) if the functions (2) converge to a
limit. Therefore from (5) and (6) we again conclude that (10) holds. The
proof of the theorem is thereby completed.

The theorem just proved is of considerable interest, since it permits us
to replace the investigation of sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n$$

of infinitesimal random variables $\xi_{nk}$ with, generally speaking, arbitrary
distribution functions $F_{nk}(x)$ by an investigation of sums

$$\tilde{\zeta}_n = \tilde{\xi}_{n1} + \tilde{\xi}_{n2} + \cdots + \tilde{\xi}_{nk_n} - A_n$$

of infinitely divisible variables $\tilde{\xi}_{nk}$. This circumstance, as already stated
in § 20, is made the basis for the exposition of theorems concerning the
limit distributions for sums of independent variables.
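
How close the sums are to their accompanying infinitely divisible laws can be seen on a small numerical example. The sketch below is an illustration under stated assumptions, not the author's construction in general: it takes $\xi_{nk}$ equal to 1 with probability $\lambda/n$ and 0 otherwise ($k = 1, \ldots, n$), a truncation level $\tau < 1$ so that every $\alpha_{nk} = 0$, and $A_n = 0$. Formula (2) then reduces to $\psi_n(t) = \lambda (e^{it} - 1)$, the logarithm of the Poisson characteristic function. The names `cf_sum`, `cf_accompanying`, and `max_gap` are hypothetical.

```python
import cmath
import math


def cf_sum(n, lam, t):
    """Characteristic function of a sum of n independent Bernoulli(lam/n) variables."""
    p = lam / n
    return (1.0 - p + p * cmath.exp(1j * t)) ** n


def cf_accompanying(n, lam, t):
    """exp(psi_n(t)) with psi_n(t) = sum_k integral (e^{itx} - 1) dF_nk(x).

    With tau < 1 every alpha_nk is 0, and A_n = 0 is assumed, so
    psi_n(t) = n * p * (e^{it} - 1) with p = lam / n.
    """
    p = lam / n
    return cmath.exp(n * p * (cmath.exp(1j * t) - 1.0))


def max_gap(n, lam=2.0):
    """Largest |f_n(t) - exp(psi_n(t))| over a small grid of t values."""
    ts = [0.25 * j for j in range(1, 41)] + [math.pi]
    return max(abs(cf_sum(n, lam, t) - cf_accompanying(n, lam, t)) for t in ts)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_gap(n))
```

The gap shrinks roughly like $1/n$ here, in line with the estimate $\sup_k |\beta_{nk}| \sum_k |\beta_{nk}|$, where $|\beta_{nk}| \le 2\lambda/n$ for this array.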

As a first consequence of the theorem, we cite the following fundamental 
result, obtained by A. Ya. Khintchine [58]. 


THEOREM 2. In order that $F(x)$ be the limit distribution function of sums (1)
of infinitesimal random variables which are independent in each row, it is
necessary and sufficient that $F(x)$ be infinitely divisible.

Proof. From the preceding theorem we know that the limit distribution
function for the sums (1) is simultaneously the limit of infinitely divisible
laws and so according to Theorem 3 of § 17 is itself infinitely divisible.
The converse proposition, that every infinitely divisible law is the limit
law for sums of infinitesimal variables, follows readily from the definition
of infinitely divisible laws.

Thus we have proved that the class of limit laws for the sums (1) of
independent infinitesimal random variables coincides with the class of infinitely
divisible laws.

If the $\xi_{nk}$ ($1 \le k \le k_n$, $n = 1, 2, \ldots$) are asymptotically constant
random variables, then the variables $\xi_{nk} - m\xi_{nk}$ are infinitesimal. Therefore
the class of limit laws for sums (1) of asymptotically constant variables
(not only of infinitesimal variables) coincides with the class of infinitely
divisible laws.


§ 25. NECESSARY AND SUFFICIENT CONDITIONS FOR CONVERGENCE 


Theorem 1 of § 24 enables us to find conditions for the existence of a
limit distribution function for the sums (1) of § 24.


THEOREM 1. In order that for some suitably chosen constants $A_n$ the
distributions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n \qquad (1)$$

of independent infinitesimal random variables converge to a limit, it is
necessary and sufficient that there exist nondecreasing functions

$$M(u) \quad (M(-\infty) = 0) \qquad \text{and} \qquad N(u) \quad (N(+\infty) = 0),$$

defined in the intervals $(-\infty, 0)$ and $(0, +\infty)$ respectively, and a constant
$\sigma \ge 0$, such that

1) At every continuity point of $M(u)$ and $N(u)$

$$\lim_{n \to \infty} \sum_{k=1}^{k_n} F_{nk}(u) = M(u) \qquad (u < 0),$$

$$\lim_{n \to \infty} \sum_{k=1}^{k_n} (F_{nk}(u) - 1) = N(u) \qquad (u > 0),$$

2)

$$\begin{aligned}
\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Bigr)^2 \Bigr\} \\
= \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Bigr)^2 \Bigr\} = \sigma^2.
\end{aligned}$$

The constants $A_n$ may be chosen according to the formula

$$A_n = \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x) - \gamma(\tau),$$

where $\gamma(\tau)$* is any constant and $-\tau$ and $+\tau$ are continuity points of $M(u)$
and $N(u)$.


The logarithm of the characteristic function of the limit law is defined by
the formula (7) of § 18 with the functions $M(u)$, $N(u)$ and the above
constants $\sigma$, $\gamma(\tau)$.
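
The quantities entering conditions 1) and 2) are easy to compute for arrays of finite discrete distributions. The sketch below does this for two arrays chosen for illustration only (they are assumptions, as are the function names): $\xi_{nk} = \pm 1/\sqrt{n}$ with probability 1/2 each, for which one expects $M \equiv 0$, $N \equiv 0$, $\sigma^2 = 1$, and $\xi_{nk}$ equal to 1 with probability $\lambda/n$ and 0 otherwise, for which one expects $\sigma^2 = 0$ and $N(u) = -\lambda$ on $0 < u < 1$. Distribution functions are taken in the form $F(u) = P\{\xi < u\}$.

```python
import math


def tail_sums(row, u):
    """Return sum_k F_nk(u) for u < 0, or sum_k (F_nk(u) - 1) for u > 0.

    `row` is a list of discrete distributions, each a list of
    (value, probability) pairs; F(u) = P{xi < u}.
    """
    total = 0.0
    for dist in row:
        cdf = sum(p for v, p in dist if v < u)
        total += cdf if u < 0 else cdf - 1.0
    return total


def truncated_variance_sum(row, eps):
    """Sum over k of the second moment minus the squared first moment on |x| < eps."""
    total = 0.0
    for dist in row:
        m1 = sum(v * p for v, p in dist if abs(v) < eps)
        m2 = sum(v * v * p for v, p in dist if abs(v) < eps)
        total += m2 - m1 * m1
    return total


def scaled_sign_row(n):
    """n variables equal to +1/sqrt(n) or -1/sqrt(n) with probability 1/2 each."""
    a = 1.0 / math.sqrt(n)
    return [[(-a, 0.5), (a, 0.5)] for _ in range(n)]


def rare_event_row(n, lam):
    """n variables equal to 1 with probability lam/n and to 0 otherwise."""
    p = lam / n
    return [[(0.0, 1.0 - p), (1.0, p)] for _ in range(n)]


if __name__ == "__main__":
    n = 400
    for name, row in (("signs", scaled_sign_row(n)), ("rare", rare_event_row(n, 2.0))):
        print(name, tail_sums(row, -0.5), tail_sums(row, 0.5),
              truncated_variance_sum(row, 0.1))
```

The first array corresponds to a normal limit and the second to a Poisson limit with parameter $\lambda$.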


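Proof. The theorem formulated above is a consequence of Theorems 2
of § 19 and 1 of § 24. In fact, by Theorem 1 of § 24 we may confine ourselves
to the investigation of conditions of convergence of infinitely divisible laws
defined by (2), § 24. For these laws we should take in (8) of § 18:

$$M_n(u) = \sum_{k=1}^{k_n} \int_{-\infty}^{u} dF_{nk}(x + \alpha_{nk}) \qquad \text{for } u < 0,$$

$$N_n(u) = -\sum_{k=1}^{k_n} \int_{u}^{\infty} dF_{nk}(x + \alpha_{nk}) \qquad \text{for } u > 0,$$

$$\sigma_n = 0 \qquad \text{and} \qquad \gamma_n(\tau) = -A_n + \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x) + \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, d\bar{F}_{nk}(x),$$

where $\bar{F}_{nk}(x) = F_{nk}(x + \alpha_{nk})$.

According to Theorem 2 of § 19 for the convergence of the distribution
laws (2) of § 24 it is necessary and sufficient that as $n \to \infty$

$$1') \qquad \sum_{k=1}^{k_n} \int_{-\infty}^{u} dF_{nk}(x + \alpha_{nk}) \to M(u) \qquad (u < 0);$$

$$-\sum_{k=1}^{k_n} \int_{u}^{\infty} dF_{nk}(x + \alpha_{nk}) \to N(u) \qquad (u > 0);$$

$$\begin{aligned}
2') \qquad \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{-\varepsilon}^{0} x^2\, d\bar{F}_{nk}(x) + \int_{0}^{\varepsilon} x^2\, d\bar{F}_{nk}(x) \Bigr\} &= \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \int_{-\varepsilon}^{+\varepsilon} x^2\, d\bar{F}_{nk}(x) \\
&= \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \int_{-\varepsilon}^{+\varepsilon} x^2\, d\bar{F}_{nk}(x) = \sigma^2;
\end{aligned}$$

$$3') \qquad -A_n + \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x) + \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) \to \gamma(\tau).$$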

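* Translator's note. The argument $\tau$ apparently serves to recall formula (8) of
§ 18.

From 1'), 2'), 3') we deduce first of all that $A_n$ may be chosen as
indicated in Theorem 1. For this purpose we shall prove that

$$\sum_{k=1}^{k_n} \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) \to 0 \qquad (n \to \infty). \qquad (2)$$

We have

$$\begin{aligned}
\sum_{k=1}^{k_n} \int_{|x| < \tau} x\, d\bar{F}_{nk}(x) &= \sum_{k=1}^{k_n} \int_{|x - \alpha_{nk}| < \tau} (x - \alpha_{nk})\, dF_{nk}(x) \\
&= \sum_{k=1}^{k_n} \Bigl\{ \int_{|x - \alpha_{nk}| < \tau} x\, dF_{nk}(x) - \int_{|x| < \tau} x\, dF_{nk}(x) + \alpha_{nk} \int_{|x| \ge \tau} d\bar{F}_{nk}(x) \Bigr\}.
\end{aligned}$$

According to Theorem 2 of § 23 there exists a constant C such that

$$\sum_{k=1}^{k_n} \int_{|x| \ge \tau} d\bar{F}_{nk}(x) < C.*$$

From this, taking into account that

$$\alpha_n = \sup_{1 \le k \le k_n} |\alpha_{nk}| \to 0 \qquad (n \to \infty), \qquad (3)$$

we find

$$\Bigl| \sum_{k=1}^{k_n} \alpha_{nk} \int_{|x| \ge \tau} d\bar{F}_{nk}(x) \Bigr| \le C \alpha_n \to 0 \qquad (n \to \infty).$$

On the other hand, for sufficiently large n,†

$$\begin{aligned}
\Bigl| \sum_{k=1}^{k_n} \Bigl\{ \int_{|x - \alpha_{nk}| < \tau} x\, dF_{nk}(x) - \int_{|x| < \tau} x\, dF_{nk}(x) \Bigr\} \Bigr| &\le \sum_{k=1}^{k_n} \Bigl\{ \int_{|x + \tau| \le |\alpha_{nk}|} |x|\, dF_{nk}(x) + \int_{|x - \tau| \le |\alpha_{nk}|} |x|\, dF_{nk}(x) \Bigr\} \\
&\le 2\tau \sum_{k=1}^{k_n} \Bigl\{ \int_{|x + \tau| \le 2\alpha_n} d\bar{F}_{nk}(x) + \int_{|x - \tau| \le 2\alpha_n} d\bar{F}_{nk}(x) \Bigr\}.
\end{aligned}$$

The last member of this inequality approaches zero, by 1'), formula (3),
and the fact that $-\tau$ and $+\tau$ are continuity points of the functions $M(u)$
and $N(u)$.

Thus (2) is proved.


* Translator's note. In order to prove the equivalence of 1'), 2'), 3') and the
conditions 1), 2) of the Theorem, we need the inequality $\sum_{k=1}^{k_n} \int_{|x| \ge \tau} d\bar{F}_{nk}(x) < C$
under each set of conditions. This inequality follows easily from either 1') or 1),
using the fact $\sup_{1 \le k \le k_n} |\alpha_{nk}| \to 0$ ($n \to \infty$) deduced at the beginning of the proof of
Theorem 2 of § 23.
† Translator's note. The original formula after this sentence is incorrect and is
corrected here in one of the possible ways.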

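Furthermore, at continuity points u of the function $M(u)$

$$\sum_{k=1}^{k_n} \int_{-\infty}^{u - \alpha_n} dF_{nk}(x + \alpha_{nk}) \to M(u) \qquad (n \to \infty),$$

$$\sum_{k=1}^{k_n} \int_{-\infty}^{u + \alpha_n} dF_{nk}(x + \alpha_{nk}) \to M(u) \qquad (n \to \infty).$$

Now

$$\sum_{k=1}^{k_n} \int_{-\infty}^{u - \alpha_n} dF_{nk}(x + \alpha_{nk}) \le \sum_{k=1}^{k_n} \int_{-\infty}^{u} dF_{nk}(x) \le \sum_{k=1}^{k_n} \int_{-\infty}^{u + \alpha_n} dF_{nk}(x + \alpha_{nk}),$$

and consequently,

$$\sum_{k=1}^{k_n} \int_{-\infty}^{u} dF_{nk}(x) = \sum_{k=1}^{k_n} F_{nk}(u) \to M(u) \qquad (n \to \infty).$$

In exactly the same way it is proved that for $u > 0$

$$-\sum_{k=1}^{k_n} \int_{u}^{\infty} dF_{nk}(x) = \sum_{k=1}^{k_n} (F_{nk}(u) - 1) \to N(u).$$

We have thereby proved that 1') implies 1). The converse is proved just
as simply, and we shall not pause for it.*

It remains to establish the equivalence of 2) and 2').

For this purpose we note first of all the following obvious relation:

$$\begin{aligned}
\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + \alpha_{nk}) &= \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x + \alpha_{nk}| < \varepsilon} x^2\, dF_{nk}(x + \alpha_{nk}) \\
&= \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} (x - \alpha_{nk})^2\, dF_{nk}(x),
\end{aligned}$$

$$\lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + \alpha_{nk}) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} (x - \alpha_{nk})^2\, dF_{nk}(x).$$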


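* See the translator's note just before (3).

Furthermore,

$$\begin{aligned}
\sum_{k=1}^{k_n} \int_{|x| < \varepsilon} (x - \alpha_{nk})^2\, dF_{nk}(x) &= \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - 2\alpha_{nk} \int_{|x| < \varepsilon} x\, dF_{nk}(x) + \alpha_{nk}^2 \int_{|x| < \varepsilon} dF_{nk}(x) \Bigr\} \\
&= \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Bigr)^2 \Bigr\} \\
&\quad + \sum_{k=1}^{k_n} \Bigl\{ \Bigl( \int_{\varepsilon \le |x| < \tau} x\, dF_{nk}(x) \Bigr)^2 - \alpha_{nk}^2 \int_{|x| \ge \varepsilon} dF_{nk}(x) \Bigr\}.
\end{aligned}$$

We denote the second sum of the last member of the equality above by
$\rho_n$.

It is easy to see that

$$|\rho_n| \le \Bigl( \tau^2 \sup_{1 \le k \le k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) + \alpha_n^2 \Bigr) \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x).$$

The first factor approaches zero as $n \to \infty$. The second factor is
bounded.* In fact, if the distribution functions of the sums (1) of § 24
converge to a limit, then by Theorem 2 of § 23

$$\sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \le \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon/2} dF_{nk}(x + \alpha_{nk}) \le \frac{4 + \varepsilon^2}{\varepsilon^2} \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon/2} \frac{x^2}{1 + x^2}\, dF_{nk}(x + \alpha_{nk}) < C.$$

If, on the other hand, the conditions of Theorem 1 are satisfied, then,
whatever $\varepsilon > 0$ is, for sufficiently large n

$$\sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) < C.$$

Thus in both cases

$$\rho_n \to 0.$$

We have thereby completed the proof of the equivalence of the conditions
1'), 2'), 3') and the conditions of the theorem. The theorem is proved
completely.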


* Translator's note. This follows also from 1'). See the preceding note.

Remark. Theorem 1 of § 19 and Theorem 1 of § 24 enable us to formulate
the conditions of existence of a limit law for the sums (1) of § 24 in another
way.

In order that for some suitably chosen constants $A_n$ the distribution laws
of the sums (1) of infinitesimal summands converge to a limit law, it is
necessary and sufficient that there exist a nondecreasing function $G(u)$ of bounded
variation such that

$$G_n(u) = \sum_{k=1}^{k_n} \int_{-\infty}^{u} \frac{x^2}{1 + x^2}\, dF_{nk}(x + \alpha_{nk}) \Rightarrow G(u)$$

as $n \to \infty$.

The function $G(u)$ determines the limit distribution function according to
the formula of Lévy and Khintchine.

It is obvious that the results obtained above can be carried over
automatically to asymptotically constant summands. We shall confine
ourselves to formulating one theorem which is almost a literal repetition of
Theorem 1.


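THEOREM 2. In order that for some suitably chosen constants $A_n$ the
distribution laws of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n$$

of independent asymptotically constant random variables converge to a
limit, it is necessary and sufficient that there exist functions $M(u)$ and
$N(u)$ and a constant $\sigma$ such that:

1) At continuity points of the functions $M(u)$ and $N(u)$

$$\sum_{k=1}^{k_n} \int_{-\infty}^{u} dF_{nk}(x + m_{nk}) \to M(u) \qquad (n \to \infty) \quad (u < 0),$$

$$-\sum_{k=1}^{k_n} \int_{u}^{\infty} dF_{nk}(x + m_{nk}) \to N(u) \qquad (n \to \infty) \quad (u > 0);$$

2)

$$\begin{aligned}
\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 \Bigr\} \\
= \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 \Bigr\} = \sigma^2.
\end{aligned}$$

The constants $A_n$ may be chosen according to the formula

$$A_n = \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x + m_{nk}) + \sum_{k=1}^{k_n} m_{nk} - \gamma(\tau),$$

where $\gamma(\tau)$ is any constant.

The logarithm of the characteristic function of the limit law is given by the
formula (7) of § 18.


This theorem is a consequence of the preceding one, since the variables
$\xi_{nk} - m_{nk}$ are infinitesimal.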
We consider the particular case of the last theorem when

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 = 0.$$

This circumstance takes place whenever the limit law does not have a
normal component ($\sigma = 0$) or whenever the variables $\xi_{nk}$ are symmetrical
with respect to the medians.


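THEOREM 3. If

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 = 0, \qquad (4)$$

then for the convergence (for suitably chosen $A_n$) of the distribution laws
of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n \qquad (1)$$

of independent asymptotically constant random variables to a limit, it is
necessary and sufficient that there exist a function $G(u)$ such that

$$G_n(u) = \sum_{k=1}^{k_n} \int_{-\infty}^{u} \frac{x^2}{1 + x^2}\, dF_{nk}(x + m_{nk}) \Rightarrow G(u)$$

as $n \to \infty$. The constants $A_n$ may be chosen according to the formula

$$A_n = \sum_{k=1}^{k_n} \Bigl[ \int_{|x| < \tau} x\, dF_{nk}(x + m_{nk}) + m_{nk} \Bigr].$$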
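Proof. If (4) is satisfied, then, by the preceding theorem, for the
convergence of the distribution laws of the sums (1) to a limit it is necessary
and sufficient that there exist a constant $\sigma$ and functions $M(u)$ and $N(u)$
such that as $n \to \infty$

$$1) \qquad \sum_{k=1}^{k_n} F_{nk}(x + m_{nk}) \to M(x) \qquad (x < 0),$$

$$2) \qquad \sum_{k=1}^{k_n} (F_{nk}(x + m_{nk}) - 1) \to N(x) \qquad (x > 0),$$

$$3) \qquad \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) = \sigma^2.$$

If we introduce the function $G(u)$, defined by the equations

$$G(u) = \int_{-\infty}^{u} \frac{x^2}{1 + x^2}\, dM(x) \qquad (u < 0),$$

$$G(+0) = \sigma^2 + \int_{-\infty}^{0} \frac{x^2}{1 + x^2}\, dM(x),$$

$$G(u) = G(+0) + \int_{0}^{u} \frac{x^2}{1 + x^2}\, dN(x) \qquad (u > 0),$$

then by Theorems 1 and 2 of § 19 the relations 1)-3) just written down are
equivalent to the condition $G_n \Rightarrow G$ in the theorem considered.

We shall now prove that if the limit law does not have a normal
component (i.e., if $\sigma = 0$), then

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 = 0.$$

For this purpose we note that if we denote by $\Delta_k$ that one of the intervals
$(-\varepsilon, 0)$ and $(0, \varepsilon)$ for which the integral

$$\int_{\Delta} x\, dF_{nk}(x + m_{nk})$$

has a greater absolute value, then

$$\begin{aligned}
\sum_{k=1}^{k_n} \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 &\le \sum_{k=1}^{k_n} \Bigl( \int_{\Delta_k} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 \\
&\le \sum_{k=1}^{k_n} \int_{\Delta_k} x^2\, dF_{nk}(x + m_{nk}) \int_{\Delta_k} dF_{nk}(x + m_{nk}) \\
&\le \sum_{k=1}^{k_n} \int_{\Delta_k} dF_{nk}(x + m_{nk}) \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}).
\end{aligned}$$

But by the definition of the median

$$\int_{\Delta_k} dF_{nk}(x + m_{nk}) \le \frac{1}{2},$$

so that

$$\sum_{k=1}^{k_n} \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 \le \frac{1}{2} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}).$$

Now the desired relation follows from Theorem 2 and from the inequality

$$\frac{1}{2} \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) \le \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x + m_{nk}) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x + m_{nk}) \Bigr)^2 \Bigr\}.$$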

Modifying somewhat the formulation of the last three theorems we can 
obtain not only conditions for the existence of a limit law for the sums, 
but also conditions for the convergence to any given limit law. Let us 
paraphrase, for example, Theorem 1. 

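THEOREM 4. In order that for suitably chosen constants $A_n$ the distribution
functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n$$

of independent infinitesimal random variables converge to the distribution
function $F(x)$, it is necessary and sufficient that the following conditions
be satisfied:

1) At continuity points of $M(u)$ and $N(u)$

$$\sum_{k=1}^{k_n} F_{nk}(x) \to M(x) \qquad \text{for } x < 0,$$

$$\sum_{k=1}^{k_n} (F_{nk}(x) - 1) \to N(x) \qquad \text{for } x > 0$$

as $n \to \infty$;

2)

$$\begin{aligned}
\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Bigr)^2 \Bigr\} \\
= \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \sum_{k=1}^{k_n} \Bigl\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) - \Bigl( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Bigr)^2 \Bigr\} = \sigma^2,
\end{aligned}$$

where the functions $M(u)$, $N(u)$ and the constant $\sigma^2$ are determined by
Lévy's formula for $F(x)$. The constants $A_n$ are determined by

$$A_n = \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x) - \gamma_n(\tau),$$

where $\gamma_n(\tau)$ is any convergent sequence of real numbers.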


Remark. If it is required to state the condition for the convergence of the
distribution functions of the sums

$$\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} \qquad (5)$$

to a limit distribution function, then one more condition should be added
to the conditions of Theorem 4:

$$\lim_{n \to \infty} \sum_{k=1}^{k_n} \int_{|x| < \tau} x\, dF_{nk}(x) = \gamma(\tau).$$


CHAPTER 5 


CONVERGENCE TO NORMAL, POISSON, AND 
UNITARY DISTRIBUTIONS 


§ 26. CONDITIONS FOR CONVERGENCE TO NORMAL
AND POISSON LAWS


We shall now make use of the general theorems in the preceding chapter 
to clarify conditions for convergence of distribution functions of sums to 
the various particular limit laws with which the classical theory of prob- 
ability concerned itself. In this connection we shall confine ourselves in 
almost all theorems to the consideration of sums of infinitesimal random 
variables, since the consideration of asymptotically constant summands, 
as we have seen before, can be reduced to that of infinitesimal ones. 
And only in the treatment of theorems of the type of the law of large 
numbers does the very essence of the problem compel us to consider 
asymptotically constant summands. 

The general problems considered in the preceding chapter were raised 
and solved only in recent years. The main interest of classical investigations 
amounted to the clarification of conditions for the convergence of distribu- 
tion functions of sums to the normal law and to the determination of the 
broadest conditions under which the law of large numbers holds. It is 
interesting to note that, in essence, the classical theory of probability 
studied only one proper limit distribution law — the normal law. The 
study of the Poisson law was confined only to elementary investigations. 
The causes for such one-sidedness of classical investigations have been 
completely uncovered in recent times. It turns out that the normal distribu- 
tion law indeed plays a dominating role in theoretical as well as applied 
questions. We shall see below that whereas for the convergence of distribu- 
tion functions of sums of independent variables to the normal law only 
restrictions of a very general kind, apart from that of being infinitesimal 
(or asymptotically constant), have to be imposed on the summands, for 
the convergence to another limit law some very special properties are 
required of the summands. 

We see that the general theorems developed in the preceding chapters
permit us to obtain, literally in a few words, the proofs of the most
important theorems in the theory of probability. However, this circumstance
must not belittle in the eyes of the reader either the value of those theorems
themselves or that of the efforts spent by mathematicians in the
formulation and proof of those propositions.

We now turn to the discussion of concrete results, and we begin this 
discussion with the proof of a theorem of A. Ya. Khintchine [59], clarifying 
the fundamental importance of the normal law in the theory of probability. 


THEOREM 1. If the distributions of the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
of infinitesimal random variables \(\xi_{nk}\) \((1 \le k \le k_n)\) which are independent
in each row converge to a limit, then the relation
\[
\sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \to 0 \qquad (n \to \infty) \tag{1}
\]
is satisfied for every \(\varepsilon > 0\) if and only if the limit law is normal.


Proof. Since by hypothesis a limit law exists, we have, by Theorem 4 
of § 25, 


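\[
1)\quad \sum_{k=1}^{k_n} F_{nk}(x) = \sum_{k=1}^{k_n} \int_{u<x} dF_{nk}(u) \to M(x) \qquad (x < 0),
\]
\[
2)\quad \sum_{k=1}^{k_n} \big( F_{nk}(x) - 1 \big) = -\sum_{k=1}^{k_n} \int_{u \ge x} dF_{nk}(u) \to N(x) \qquad (x > 0).
\]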


Hence we conclude that if (1) is satisfied, then M(x) = 0, N(x) = 0, 
and consequently the limit law is normal. Conversely, if it is known that 
the limit law is normal, then M(x) = 0, N(x) = 0, and therefore by 1) 
and 2) the relation (1) holds, proving the theorem. 

We have seen that if we impose on the variables \(\xi_{nk}\) only the requirement
of being infinitesimal, i.e., the requirement that as \(n \to \infty\)
\[
\sup_{1 \le k \le k_n} P\{ |\xi_{nk}| \ge \varepsilon \} \to 0 \tag{2}
\]
for every \(\varepsilon > 0\), then any infinitely divisible law can serve as the limit law
for the sums \(\zeta_n\). The condition (1) means, as we shall now prove, not only
that the individual summands are small, but that they are uniformly
small. In other words, for every \(\varepsilon > 0\) the probability that at least one
of the \(|\xi_{nk}|\) \((1 \le k \le k_n)\) exceeds \(\varepsilon\) approaches zero as \(n \to \infty\). To put it in
a formula, this can be written as*
\[
P\Big\{ \sup_{1 \le k \le k_n} |\xi_{nk}| \ge \varepsilon \Big\} \to 0 \qquad (n \to \infty). \tag{3}
\]


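It is clear that
\[
P\Big\{ \sup_{1 \le k \le k_n} |\xi_{nk}| \ge \varepsilon \Big\}
= 1 - P\Big\{ \sup_{1 \le k \le k_n} |\xi_{nk}| < \varepsilon \Big\}
= 1 - \prod_{k=1}^{k_n} P\{ |\xi_{nk}| < \varepsilon \}
= 1 - \prod_{k=1}^{k_n} \Big( 1 - \int_{|x| \ge \varepsilon} dF_{nk}(x) \Big).
\]
Hence the condition (3) and
\[
\prod_{k=1}^{k_n} \Big( 1 - \int_{|x| \ge \varepsilon} dF_{nk}(x) \Big) \to 1 \qquad (n \to \infty) \tag{4}
\]
are equivalent.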
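The inequality
\[
1 - \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x)
\le \prod_{k=1}^{k_n} \Big( 1 - \int_{|x| \ge \varepsilon} dF_{nk}(x) \Big)
\le \exp\Big( -\sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \Big) \le 1
\]
shows that (1) implies (4) and so also (3), and conversely that (3) implies
(1).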
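
The squeeze between the sum and the product of the tail probabilities can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the numbers p_k stand for the tail probabilities \(\int_{|x| \ge \varepsilon} dF_{nk}(x)\), and the function names `bounds`, `row_small` and `row_infinitesimal_only` are hypothetical helpers, not notation from the text. The second row reproduces the situation of the footnoted example, where every p_k = 1/n tends to zero while the row sum stays equal to one.

```python
import math

def bounds(ps):
    """Return (1 - sum p_k, prod (1 - p_k), exp(-sum p_k)) for tail probabilities p_k."""
    total = sum(ps)
    product = math.prod(1.0 - p for p in ps)
    return 1.0 - total, product, math.exp(-total)

def row_small(n):
    # Row sums of tail probabilities tend to zero: relation (1) holds,
    # so the product is squeezed to 1 and (4), hence (3), follows.
    return [1.0 / n**2] * n

def row_infinitesimal_only(n):
    # Each p_k = 1/n tends to zero (infinitesimal summands), but the row sum
    # stays equal to 1, so the product tends to 1/e and (3) fails.
    return [1.0 / n] * n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, bounds(row_small(n)), bounds(row_infinitesimal_only(n)))
```

In the first row all three quantities approach 1; in the second the product approaches 1/e, so the probability that some summand exceeds the threshold approaches 1 - 1/e rather than 0.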

The preceding theorem has its analogue in the theory of stochastic
processes with independent increments. The collection of random variables
\(\xi_\lambda\) depending on the continuously varying real parameter \(\lambda\) is called a
stochastic process with independent increments, if the increments of the
random variable \(\xi_\lambda\) in disjoint intervals of the parameter \(\lambda\) are independent
random variables.


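* We remark that the random variables can be infinitesimal without satisfying
the requirement (3).
For example, let
\[
F_{nk}(x) =
\begin{cases}
0 & \text{for } x < 0, \\
1 - \dfrac{1}{n} & \text{for } 0 \le x < 1, \\
1 & \text{for } x \ge 1
\end{cases}
\qquad (1 \le k \le n).
\]
Then for every \(\varepsilon\) \((0 < \varepsilon < 1)\), as \(n \to \infty\)
\[
\sup_{1 \le k \le n} P\{ |\xi_{nk}| \ge \varepsilon \} = \frac{1}{n} \to 0,
\]
but at the same time
\[
P\Big\{ \sup_{1 \le k \le n} |\xi_{nk}| \ge \varepsilon \Big\}
= 1 - \prod_{k=1}^{n} P\{ |\xi_{nk}| < \varepsilon \}
= 1 - \Big( 1 - \frac{1}{n} \Big)^{n} \to 1 - \frac{1}{e} \qquad (n \to \infty).
\]

We shall say that the stochastic process \(\xi_\lambda\) is stochastically continuous,
if for every \(\varepsilon > 0\)
\[
P\{ |\xi_{\lambda + \Delta\lambda} - \xi_\lambda| > \varepsilon \} \to 0 \qquad (\Delta\lambda \to 0).
\]

Furthermore, we shall say that the process \(\xi_\lambda\) is stochastically strongly
continuous in the interval \([\lambda_0, \Lambda]\), if for every sequence
\(\lambda_0 < \lambda_1 < \cdots < \lambda_n = \Lambda\) and arbitrary \(\varepsilon > 0\)
\[
P\Big\{ \sup_{1 \le k \le n} |\xi_{\lambda_k} - \xi_{\lambda_{k-1}}| > \varepsilon \Big\} \to 0
\]
as \(\max (\lambda_k - \lambda_{k-1}) \to 0\).

Now Theorem 1 can be formulated in the language of stochastic
processes as follows:

In order that the increments \(\xi_{\lambda_2} - \xi_{\lambda_1}\) of a stochastic process with independent
increments in the interval \(\lambda_0 \le \lambda_1 < \lambda_2 \le \Lambda\) be normally distributed, it is
necessary and sufficient that the process \(\xi_\lambda\) be stochastically strongly
continuous in the interval \([\lambda_0, \Lambda]\).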











From the point of view of the developed theory of stochastic processes,
strong continuity of \(\xi_\lambda\) means that with probability one \(\xi_\lambda\) as a function of \(\lambda\)
is continuous at all points \(\lambda\). However, we cannot go into the foundation of
this assertion here (cf. § 16 and Ch. VIII of the book [76]).

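THEOREM 2. In order that for some suitably chosen constants \(A_n\) the
distributions of the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n
\]
converge as \(n \to \infty\) to the normal law
\[
\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-z^2/2}\,dz \tag{5}
\]
and the summands \(\xi_{nk}\) \((1 \le k \le k_n)\) be infinitesimal, it is necessary and
sufficient that the conditions
\[
1)\quad \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \to 0,
\]
\[
2)\quad \sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\} \to 1
\]
be satisfied for every \(\varepsilon > 0\), as \(n \to \infty\).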


Proof. Sufficiency. The condition 1) of the theorem implies that the
variables \(\xi_{nk}\) are infinitesimal. In fact, for every \(\varepsilon > 0\) we have, as \(n \to \infty\),
\[
\sup_{1 \le k \le k_n} P\{ |\xi_{nk}| \ge \varepsilon \}
= \sup_{1 \le k \le k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x)
\le \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \to 0.
\]
Therefore we find ourselves under the conditions of the preceding section
and can make use of the general results obtained there.

The necessity of the conditions of the theorem is obtained from Theorem
4 of § 25, if we put there \(\sigma = 1\), \(\gamma(\tau) = 0\), and \(M(-u) = N(u) = 0\).

In fact, the first condition of Theorem 4 of § 25 can be written as
follows: For every \(x > 0\)
\[
\sum_{k=1}^{k_n} \int_{|u| > x} dF_{nk}(u) \le^{\dagger} \sum_{k=1}^{k_n} \big\{ F_{nk}(-x) + 1 - F_{nk}(x) \big\}
\to M(-x) - N(x) = 0.
\]
It is clear that this condition coincides with 1).
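Furthermore, for every \(\varepsilon'\) \((0 < \varepsilon' < \varepsilon)\), we have
\[
\sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
= \sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon'} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon'} x\,dF_{nk}(x) \Big)^2 \Big\}
\]
\[
\quad + \sum_{k=1}^{k_n} \Big\{ \int_{\varepsilon' \le |x| < \varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{\varepsilon' \le |x| < \varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
- 2 \sum_{k=1}^{k_n} \Big( \int_{|x|<\varepsilon'} x\,dF_{nk}(x) \Big) \Big( \int_{\varepsilon' \le |x| < \varepsilon} x\,dF_{nk}(x) \Big).
\]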

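Now
\[
0 \le \sum_{k=1}^{k_n} \Big\{ \int_{\varepsilon' \le |x| < \varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{\varepsilon' \le |x| < \varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
\le \sum_{k=1}^{k_n} \int_{\varepsilon' \le |x| < \varepsilon} x^2\,dF_{nk}(x)
\le \varepsilon^2 \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon'} dF_{nk}(x)
\]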

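and
\[
2 \sum_{k=1}^{k_n} \Big| \int_{|x|<\varepsilon'} x\,dF_{nk}(x) \Big| \cdot \Big| \int_{\varepsilon' \le |x| < \varepsilon} x\,dF_{nk}(x) \Big|
\le 2 \varepsilon \varepsilon' \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon'} dF_{nk}(x).
\]
This last sum tends to zero as \(n \to \infty\), by the first condition of the theorem
already proved.

† Translator's note. In the original, an equality sign stands here. This is incorrect
unless \(x\) is a continuity point for all \(F_{nk}(x)\).

Thus, for every \(\varepsilon > 0\) and \(\varepsilon' > 0\),
\[
\limsup_{n \to \infty} \sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
= \limsup_{n \to \infty} \sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon'} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon'} x\,dF_{nk}(x) \Big)^2 \Big\},
\]
i.e., the upper limit does not depend on \(\varepsilon\). This is also true of the lower
limit. Hence by the conditions of Theorem 4 of § 25 we conclude that for
every \(\varepsilon > 0\) not only the upper and lower limits of the expression
\[
\sum_{k=1}^{k_n} \Big\{ \int_{|x|<\varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x|<\varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
\]
exist, but that also the ordinary limit exists.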
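
As a check on how the two conditions of the theorem on convergence to the normal law work in the simplest case, here is a short worked example. The choice of summands is an assumption made purely for illustration and is not taken from the text: symmetric variables \(\xi_{nk} = \pm 1/\sqrt{n}\), each sign with probability 1/2, \(k_n = n\), \(A_n = 0\).

```latex
% Illustrative assumption: P\{\xi_{nk} = 1/\sqrt{n}\} = P\{\xi_{nk} = -1/\sqrt{n}\} = 1/2, \; 1 \le k \le n.
% Fix \varepsilon > 0 and take n > 1/\varepsilon^2, so that 1/\sqrt{n} < \varepsilon.
\begin{align*}
\text{Condition 1):}\quad
  & \sum_{k=1}^{n} \int_{|x| \ge \varepsilon} dF_{nk}(x) = 0, \\
\text{Condition 2):}\quad
  & \int_{|x| < \varepsilon} x^2\, dF_{nk}(x) = \frac{1}{n}, \qquad
    \int_{|x| < \varepsilon} x\, dF_{nk}(x) = 0, \\
  & \sum_{k=1}^{n} \Big\{ \int_{|x| < \varepsilon} x^2\, dF_{nk}(x)
      - \Big( \int_{|x| < \varepsilon} x\, dF_{nk}(x) \Big)^2 \Big\}
    = n \cdot \frac{1}{n} = 1 .
\end{align*}
```

Both conditions hold for every \(\varepsilon > 0\) once n is large, which recovers the classical de Moivre-Laplace statement for symmetric Bernoulli trials.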
For later purposes it will be important to write the conditions for 
convergence to the normal law in some other forms. 


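THEOREM 3. In order that for some suitably chosen constants \(A_n\) the
distributions of the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n
\]
of independent infinitesimal random variables converge to the normal
law (5), it is necessary and sufficient that for every \(\varepsilon > 0\)
\[
1)\quad \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x + \alpha_{nk}) \to 0 \qquad (n \to \infty),
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\,dF_{nk}(x + \alpha_{nk}) \to 1 \qquad (n \to \infty),
\]
where
\[
\alpha_{nk} = \int_{|x| < \tau} x\,dF_{nk}(x),
\]
and \(\tau\) is any positive number.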


We shall not give the proof of this theorem, since it is deduced from the
conditions 1') and 2') of § 25 in the same way as the preceding theorem from
Theorem 4 of § 25.

We shall present one more theorem concerning the convergence of
distribution laws of sums of independent variables to the normal law.

The sufficiency of its conditions was indicated as early as in 1926 by
S. N. Bernstein [5]. The complete theorem was proved by Feller [27] in
1935.


THEOREM 4. In order that for a given sequence of independent random
variables
\[
\xi_1, \xi_2, \ldots, \xi_n, \ldots
\]
it should be possible to find real constants \(A_n\) and \(B_n > 0\) having the
property that the distribution laws of the sums
\[
\zeta_n = \frac{\xi_1 + \xi_2 + \cdots + \xi_n}{B_n} - A_n \tag{6}
\]
converge to the normal law (5) and the summands
\[
\xi_{nk} = \frac{\xi_k}{B_n} \qquad (1 \le k \le n)
\]
be infinitesimal, it is necessary and sufficient that there exist a sequence
of constants \(C_n\) \((C_n \to \infty)\) such that as \(n \to \infty\)
\[
\sum_{k=1}^{n} \int_{|x| > C_n} dF_k(x) \to 0,
\qquad
\frac{1}{C_n^2} \sum_{k=1}^{n} \Big\{ \int_{|x| < C_n} x^2\,dF_k(x) - \Big( \int_{|x| < C_n} x\,dF_k(x) \Big)^2 \Big\} \to \infty. \tag{7}
\]
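Proof. Necessity. Since by hypothesis the random variables \(\xi_{nk} = \xi_k / B_n\)
are infinitesimal, we can make use of Theorem 2. Since \(F_{nk}(x) = F_k(B_n x)\),
the conditions 1) and 2) now take the form
\[
1)\quad \sum_{k=1}^{n} \int_{|x| \ge \varepsilon B_n} dF_k(x) \to 0,
\]
\[
2)\quad \frac{1}{B_n^2} \sum_{k=1}^{n} \Big\{ \int_{|x| < \varepsilon B_n} x^2\,dF_k(x) - \Big( \int_{|x| < \varepsilon B_n} x\,dF_k(x) \Big)^2 \Big\} \to 1 \qquad (n \to \infty).
\]
Obviously, we can pick a sequence
\[
\varepsilon_n \to 0,
\]
so that \(\varepsilon_n B_n \to \infty\) and
\[
\sum_{k=1}^{n} \int_{|x| \ge \varepsilon_n B_n} dF_k(x) \to 0,
\qquad
\frac{1}{B_n^2} \sum_{k=1}^{n} \Big\{ \int_{|x| < \varepsilon_n B_n} x^2\,dF_k(x) - \Big( \int_{|x| < \varepsilon_n B_n} x\,dF_k(x) \Big)^2 \Big\} \to 1 \tag{8}
\]
as \(n \to \infty\). Putting here \(\varepsilon_n B_n = C_n\), we obtain (7).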
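Sufficiency. Now let (7) be satisfied.
Put
\[
B_n^2 = \sum_{k=1}^{n} \Big\{ \int_{|x| < C_n} x^2\,dF_k(x) - \Big( \int_{|x| < C_n} x\,dF_k(x) \Big)^2 \Big\}.
\]
Comparing this equation with the second condition of the theorem, we
conclude that
\[
C_n = o(B_n).
\]
Hence for every \(\varepsilon > 0\) and sufficiently large \(n\)
\[
\sum_{k=1}^{n} \int_{|x| > \varepsilon B_n} dF_k(x) \le \sum_{k=1}^{n} \int_{|x| > C_n} dF_k(x).
\]
Thus, by the first condition of the theorem, for every \(\varepsilon > 0\)
\[
\sum_{k=1}^{n} \int_{|x| > \varepsilon B_n} dF_k(x) \to 0 \tag{9}
\]
as \(n \to \infty\). Furthermore, we have
\[
\frac{1}{B_n^2} \sum_{k=1}^{n} \Big\{ \int_{C_n \le |x| < \varepsilon B_n} x^2\,dF_k(x) - \Big( \int_{C_n \le |x| < \varepsilon B_n} x\,dF_k(x) \Big)^2 \Big\}
\le \varepsilon^2 \sum_{k=1}^{n} \int_{|x| > C_n} dF_k(x) \to 0 \qquad (n \to \infty),
\]
\[
\frac{2}{B_n^2} \sum_{k=1}^{n} \Big| \int_{|x| < C_n} x\,dF_k(x) \Big| \cdot \Big| \int_{C_n \le |x| < \varepsilon B_n} x\,dF_k(x) \Big|
\le \frac{2 \varepsilon C_n}{B_n} \sum_{k=1}^{n} \int_{|x| > C_n} dF_k(x) \to 0.
\]
From this and from the definition of \(B_n\), we see that for every \(\varepsilon > 0\)
\[
\frac{1}{B_n^2} \sum_{k=1}^{n} \Big\{ \int_{|x| < \varepsilon B_n} x^2\,dF_k(x) - \Big( \int_{|x| < \varepsilon B_n} x\,dF_k(x) \Big)^2 \Big\} \to 1 \tag{10}
\]
as \(n \to \infty\). Since (9) implies that the variables \(\xi_k / B_n\) \((1 \le k \le n)\) are
infinitesimal, we find ourselves under the conditions of Theorem 2; hence
it follows that (9) and (10), and so also the conditions of the theorem, are
sufficient for the convergence of the distribution laws of the sums to a
normal law.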


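THEOREM 5.* In order that the distribution laws of the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
of independent infinitesimal random variables converge to the Poisson law
\[
P(x) = e^{-\lambda} \sum_{0 \le k < x} \frac{\lambda^k}{k!},
\]
it is necessary and sufficient that for every \(\varepsilon\) \((0 < \varepsilon < 1)\) the following
conditions be satisfied:
\[
1)\quad \sum_{k=1}^{k_n} \int_{R_\varepsilon} dF_{nk}(x) \to 0 \qquad (n \to \infty),
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{|x-1| < \varepsilon} dF_{nk}(x) \to \lambda \qquad (n \to \infty),
\]
\[
3)\quad \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x\,dF_{nk}(x) \to 0 \qquad (n \to \infty),
\]
\[
4)\quad \sum_{k=1}^{k_n} \Big\{ \int_{|x| < \varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x| < \varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\} \to 0 \qquad (n \to \infty),
\]
where \(R_\varepsilon\) denotes the domain obtained from the real line \(-\infty < x < +\infty\)
by discarding the intervals \(|x| < \varepsilon\) and \(|x - 1| < \varepsilon\).

* B. V. Gnedenko [36], J. Marcinkiewicz [81].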


Proof. This theorem is as easily deduced from Theorem 4 of § 25 as is
Theorem 2; to this end it is sufficient to note that in Lévy's formula
for the Poisson law we should put \(M(u) = 0\); \(N(u) = -\lambda\) for \(0 < u < 1\);
\(N(u) = 0\) for \(u > 1\); \(\sigma = 0\); \(\gamma(\tau) = 0\) for \(0 < \tau < 1\).
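
The four conditions for convergence to the Poisson law can be checked by hand in the classical "law of rare events" setting. The Python sketch below is an illustration under assumptions of our own: the rows consist of n independent Bernoulli variables with success probability lam/n, and the names `row_conditions`, `binomial_pmf`, `poisson_pmf` and `total_variation` are hypothetical helpers. It evaluates the four sums for such a row and compares the resulting binomial law with the Poisson law on the first few integers.

```python
import math

def binomial_pmf(n, p, k):
    """Probability that a sum of n independent Bernoulli(p) variables equals k."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(lam, k):
    """Probability of the value k under the Poisson law with parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def row_conditions(n, lam, eps):
    """Sums 1)-4) for a row of n Bernoulli(lam/n) variables, with 0 < eps < 1.

    Each variable takes only the values 0 and 1, so nothing depends on eps.
    """
    p = lam / n
    cond1 = 0.0      # no mass outside the intervals |x| < eps and |x - 1| < eps
    cond2 = n * p    # total mass near the point 1
    cond3 = 0.0      # the only atom with |x| < eps is at 0
    cond4 = 0.0      # truncated variances near 0 vanish for the same reason
    return cond1, cond2, cond3, cond4

def total_variation(n, lam, kmax=60):
    """Half the l1 distance between Binomial(n, lam/n) and Poisson(lam) on 0..kmax."""
    total = 0.0
    for k in range(kmax + 1):
        b = binomial_pmf(n, lam / n, k) if k <= n else 0.0
        total += abs(b - poisson_pmf(lam, k))
    return 0.5 * total

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, row_conditions(n, 2.0, 0.5), total_variation(n, 2.0))
```

The second sum equals lam for every n, the other three vanish, and the distance between the two laws shrinks as n grows, in agreement with the theorem.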


§ 27. THE LAW OF LARGE NUMBERS

In § 22 the following theorem was obtained [see (10)].

THEOREM 1. In order that the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n \tag{1}
\]
of random variables which are independent in each row obey the law of
large numbers, the condition
\[
\sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\,dF_{nk}(x + m_{nk}) \to 0 \qquad (n \to \infty)
\]
is necessary and sufficient. The constants \(A_n\) may be chosen according to
the formula
\[
A_n = \sum_{k=1}^{k_n} \Big\{ m_{nk} + \int_{|x| < \tau} x\,dF_{nk}(x + m_{nk}) \Big\},
\]
where \(\tau\) is a constant and \(m_{nk}\) is a median of \(\xi_{nk}\).
In the remark after Theorem 1 of § 23 it was indicated that we could
obtain the results concerning the law of large numbers from the general
theorems.
Now we are in a position to do so. Namely, Theorem 1 follows easily
from Theorem 3 of § 25, if we take into account that for the unitary law
we should put
\[
\gamma = 0, \qquad G(u) \equiv 0
\]
in the formula of Lévy and Khintchine.
Remark 1. It is possible to give a somewhat different formulation of
Theorem 1: In order that the sums (1) obey the law of large numbers, it is
necessary and sufficient that as \(n \to \infty\)
\[
1)\quad \sum_{k=1}^{k_n} \int_{|x| \ge 1} dF_{nk}(x + m_{nk}) \to 0,
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{|x| < 1} x^2\,dF_{nk}(x + m_{nk}) \to 0.
\]
Moreover, we may take
\[
A_n = \sum_{k=1}^{k_n} \Big\{ \int_{|x| < 1} x\,dF_{nk}(x + m_{nk}) + m_{nk} \Big\}.
\]
Remark 2. When it is also required that \(A_n = 0\), and consequently the
question concerns the conditions under which
\[
P\{ |\xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}| > \varepsilon \} \to 0 \qquad (n \to \infty)
\]
for every \(\varepsilon > 0\), then the conditions given above are not yet sufficient,
and it is necessary to add the new requirement that for every \(\tau > 0\)
\[
\sum_{k=1}^{k_n} \Big\{ \int_{|x| < \tau} x\,dF_{nk}(x + m_{nk}) + m_{nk} \Big\} \to 0 \qquad (n \to \infty).
\]


Remark 3. From Theorem 1* of § 25 we deduce: In order that the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
of independent random variables converge in probability to zero as \(n \to \infty\)
and the variables \(\xi_{nk}\) \((1 \le k \le k_n)\) be infinitesimal, it is necessary and sufficient
that for every \(\varepsilon > 0\) the following relations be satisfied as \(n \to \infty\):
\[
1)\quad \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} dF_{nk}(x) \to 0,
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x\,dF_{nk}(x) \to 0,
\]
\[
3)\quad \sum_{k=1}^{k_n} \Big\{ \int_{|x| < \varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{|x| < \varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\} \to 0.
\]

* Translator's note. Rather, Theorem 4 of § 25 and the Remark after it, noting
that \(\gamma(\tau) = 0\).

As a simple particular case of Theorem 1, we state the following theorem,
proved under the hypothesis \(B_n = n\) by A. N. Kolmogorov [63] and in
the general form by W. Feller [28].


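THEOREM 2. In order that the sequence
\[
\xi_1, \xi_2, \ldots, \xi_n, \ldots
\]
of independent random variables obey the law of large numbers, i.e., that
for a given sequence \(B_n > 0\) there exist constants \(A_n\) such that for every
\(\varepsilon > 0\)
\[
P\Big\{ \Big| \frac{\xi_1 + \xi_2 + \cdots + \xi_n}{B_n} - A_n \Big| > \varepsilon \Big\} \to 0 \qquad (n \to \infty),
\]
it is necessary and sufficient that
\[
\sum_{k=1}^{n} \int \frac{x^2}{B_n^2 + x^2}\,dF_k(x + m_k) \to 0 \qquad (n \to \infty),
\]
where \(m_k\) is a median of the variable \(\xi_k\).
The constants \(A_n\) may be chosen according to the formula
\[
A_n = \frac{1}{B_n} \sum_{k=1}^{n} \Big\{ m_k + \int_{|x| < \tau B_n} x\,dF_k(x + m_k) \Big\}, \tag{2}
\]
where \(\tau\) is an arbitrary positive number.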


In addition to the theorems given above, we shall prove the theorem 
establishing necessary and sufficient conditions for the validity of the 
law of large numbers in its classical formulation. 


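THEOREM 3.* In order that the sequence of independent random variables
\(\xi_1, \xi_2, \ldots, \xi_n, \ldots\) having finite mathematical expectations \(M\xi_n = a_n\) obey
the law of large numbers, i.e., for every \(\varepsilon > 0\)
\[
P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n} (\xi_k - a_k) \Big| > \varepsilon \Big\} \to 0 \qquad (n \to \infty), \tag{3}
\]
it is necessary and sufficient that as \(n \to \infty\)
\[
1)\quad \sum_{k=1}^{n} \int_{|x| \ge n} dF_k(x + a_k) \to 0,
\]
\[
2)\quad \frac{1}{n} \sum_{k=1}^{n} \int_{|x| < n} x\,dF_k(x + a_k) \to 0,
\]
\[
3)\quad \frac{1}{n^2} \sum_{k=1}^{n} \int_{|x| < n} x^2\,dF_k(x + a_k) \to 0.
\]

* See [63].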


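Proof. For simplicity, we put
\[
\xi_k' = \xi_k - a_k.
\]
We shall prove that if (3) is satisfied, i.e., if for every \(\varepsilon > 0\)
\[
P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n} \xi_k' \Big| > \varepsilon \Big\} \to 0 \qquad (n \to \infty), \tag{4}
\]
then the variables \(\xi_k'/n\) \((1 \le k \le n)\) are infinitesimal.

We have for every \(\delta > 0\) and for sufficiently large \(n\)
\[
P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n} \xi_k' \Big| < \varepsilon \Big\} > 1 - \delta \tag{5}
\]
and
\[
P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n-1} \xi_k' \Big| < \varepsilon \Big\} > 1 - \delta. \tag{6}
\]
Consequently,
\[
P\Big\{ \Big| \frac{\xi_n'}{n} \Big| < 2\varepsilon \Big\}
\ge P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n} \xi_k' \Big| < \varepsilon ;\ \Big| \frac{1}{n} \sum_{k=1}^{n-1} \xi_k' \Big| < \varepsilon \Big\}
\ge 1 - 2\delta,
\]
which is equivalent to saying that the variables
\[
\frac{\xi_k'}{n} \qquad (1 \le k \le n)
\]
are infinitesimal.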

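We can therefore make use of the assertion formulated in Remark 3 to
Theorem 1. Writing \(F_k'(x) = F_k(x + a_k)\) for the distribution function of
\(\xi_k' = \xi_k - a_k\), we obtain that for (4) it is necessary and sufficient that
for every \(\varepsilon > 0\), as \(n \to \infty\),
\[
\sum_{k=1}^{n} \int_{|x| \ge \varepsilon} dF_k'(nx) \to 0, \tag{7}
\]
\[
\sum_{k=1}^{n} \int_{|x| < \varepsilon} x\,dF_k'(nx) \to 0, \tag{8}
\]
\[
\sum_{k=1}^{n} \Big\{ \int_{|x| < \varepsilon} x^2\,dF_k'(nx) - \Big( \int_{|x| < \varepsilon} x\,dF_k'(nx) \Big)^2 \Big\} \to 0. \tag{9}
\]

Since the mathematical expectations exist, as \(n \to \infty\)
\[
\int_{|x| < \varepsilon n} x\,dF_k'(x) \to \int x\,dF_k'(x) = 0.
\]
Hence we conclude that
\[
\sum_{k=1}^{n} \Big( \int_{|x| < \varepsilon} x\,dF_k'(nx) \Big)^2
= \frac{1}{n^2} \sum_{k=1}^{n} \Big( \int_{|x| < \varepsilon n} x\,dF_k'(x) \Big)^2 \to 0 \qquad (n \to \infty)
\]
and that consequently (9) may be replaced by
\[
\sum_{k=1}^{n} \int_{|x| < \varepsilon} x^2\,dF_k'(nx) \to 0 \qquad (n \to \infty). \tag{10}
\]

Now the equivalence of the first and third conditions of the theorem with
(7) and (9) is obvious. It remains to prove that
\[
\sum_{k=1}^{n} \int_{|x| < \varepsilon} x\,dF_k'(nx) - \frac{1}{n} \sum_{k=1}^{n} \int_{|x| < n} x\,dF_k'(x) \to 0 \qquad (n \to \infty).
\]
But this equation follows from (7); in fact, for \(\varepsilon < 1\),
\[
\Big| \sum_{k=1}^{n} \int_{|x| < 1} x\,dF_k'(nx) - \sum_{k=1}^{n} \int_{|x| < \varepsilon} x\,dF_k'(nx) \Big|
= \Big| \sum_{k=1}^{n} \int_{\varepsilon \le |x| < 1} x\,dF_k'(nx) \Big|
\le \sum_{k=1}^{n} \int_{|x| \ge \varepsilon} dF_k'(nx) \to 0 \qquad (n \to \infty).
\]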

The preceding results permit us to obtain the following interesting 
corollary. 


COROLLARY 1. In order that the series
\[
\sum_{k=1}^{\infty} \xi_k \tag{11}
\]
of independent random variables converge with probability one, it is
necessary and sufficient that the "Cauchy criterion" be satisfied: for every
\(\varepsilon > 0\) and for \(m > n\)
\[
P\{ |\xi_n + \xi_{n+1} + \cdots + \xi_m| > \varepsilon \} \to 0 \qquad (n \to \infty).
\]


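Proof. Necessary and sufficient conditions for the convergence of the
series (11) with probability one were found by A. Ya. Khintchine and
A. N. Kolmogorov [63] and consist in the following: For every \(\varepsilon > 0\) the
following three series should converge:
\[
\sum_{k=1}^{\infty} \int_{|x| \ge \varepsilon} dF_k(x),
\qquad
\sum_{k=1}^{\infty} \int_{|x| < \varepsilon} x\,dF_k(x),
\qquad
\sum_{k=1}^{\infty} \Big\{ \int_{|x| < \varepsilon} x^2\,dF_k(x) - \Big( \int_{|x| < \varepsilon} x\,dF_k(x) \Big)^2 \Big\}.
\]
For the convergence of these series it is necessary and sufficient that
for every \(m > n\)
\[
\sum_{k=n}^{m} \int_{|x| \ge \varepsilon} dF_k(x) \to 0 \qquad (n \to \infty),
\]
\[
\sum_{k=n}^{m} \int_{|x| < \varepsilon} x\,dF_k(x) \to 0 \qquad (n \to \infty),
\]
\[
\sum_{k=n}^{m} \Big\{ \int_{|x| < \varepsilon} x^2\,dF_k(x) - \Big( \int_{|x| < \varepsilon} x\,dF_k(x) \Big)^2 \Big\} \to 0 \qquad (n \to \infty).
\]
These relations prove our assertion by Remark 3 after Theorem 1.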
This corollary may be formulated in another way. In order that the
sums
\[
\sum_{k=1}^{n} \xi_k
\]
of independent random variables should converge with probability one,
it is necessary and sufficient that they converge in probability.
As another corollary of Theorem 3 we shall state a result obtained by
A. Ya. Khintchine [51].

COROLLARY 2. If the random variables
\[
\xi_1, \xi_2, \ldots, \xi_n, \ldots
\]
are independent, identically distributed, and have a finite mathematical
expectation \(M\xi_n = a\), then the law of large numbers applies to them, i.e.,
for every \(\varepsilon > 0\) and \(n \to \infty\)
\[
P\Big\{ \Big| \frac{1}{n} \sum_{k=1}^{n} \xi_k - a \Big| > \varepsilon \Big\} \to 0.
\]
Proof. For the proof it is sufficient to verify that in the case considered
all the conditions of Theorem 3 are satisfied. To this end we note that,
by hypothesis,
\[
\int x\,dF(x + a) = 0 \tag{12}
\]
and
\[
\int |x|\,dF(x + a) < \infty. \tag{13}
\]
Now
\[
\sum_{k=1}^{n} \int_{|x| \ge n} dF_k(x + a) = n \int_{|x| \ge n} dF(x + a) \le \int_{|x| \ge n} |x|\,dF(x + a),
\]
so that by (13) the first condition of Theorem 3 is satisfied. Furthermore,
\[
\frac{1}{n} \sum_{k=1}^{n} \int_{|x| < n} x\,dF_k(x + a_k) = \int_{|x| < n} x\,dF(x + a),
\]
so that by (12) the second condition of Theorem 3 is satisfied. Let \(L > 0\)
be arbitrary. For sufficiently large \(n\), we have
\[
\frac{1}{n^2} \sum_{k=1}^{n} \int_{|x| < n} x^2\,dF_k(x + a) = \frac{1}{n} \int_{|x| < n} x^2\,dF(x + a)
\le \frac{1}{n} \int_{|x| < L} x^2\,dF(x + a) + \int_{L \le |x| < n} |x|\,dF(x + a).
\]
Whatever the constant \(L\) may be, the first term in the last sum converges
to zero. The second term can be made arbitrarily small by properly
choosing \(L\). Khintchine's theorem is thereby proved.
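
The point of Khintchine's theorem is that a finite expectation suffices even when the variance is infinite. The Python sketch below illustrates this under assumptions of our own: the common law is taken to be symmetric with density (ALPHA/2)|x|^(-ALPHA-1) for |x| >= 1, with tail index ALPHA = 1.5, so that a = 0 and the variance is infinite. The functions `condition_1`, `condition_2`, `condition_3` are hypothetical names for the three quantities of Theorem 3, evaluated in closed form for this law.

```python
ALPHA = 1.5  # tail index: finite mean (ALPHA > 1) but infinite variance (ALPHA < 2)

def condition_1(n):
    """sum_k P(|xi_k - a| >= n) = n * P(|X| >= n) = n * n**(-ALPHA) for n >= 1."""
    return n * n ** (-ALPHA)

def condition_2(n):
    """(1/n) * sum_k of the truncated first moments; zero by symmetry."""
    return 0.0

def condition_3(n):
    """(1/n^2) * sum_k of truncated second moments.

    For this density the integral of x^2 over 1 <= |x| < n equals
    ALPHA / (2 - ALPHA) * (n**(2 - ALPHA) - 1), and there are n equal terms.
    """
    return (ALPHA / (2.0 - ALPHA)) * (n ** (2.0 - ALPHA) - 1.0) / n

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        print(n, condition_1(n), condition_2(n), condition_3(n))
```

All three quantities tend to zero (the first and third like n^(-1/2)), so the law of large numbers holds for this sequence although no second moment exists.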


§ 28. RELATIVE STABILITY 


The concept of relative stability of the sums of a sequence \(\xi_1, \xi_2, \ldots,
\xi_n, \ldots\) of positive random variables was introduced by A. Ya. Khintchine
[51]. Namely, the sums
\[
s_n = \xi_1 + \xi_2 + \cdots + \xi_n
\]
of positive random variables \(\xi_1, \xi_2, \ldots, \xi_n, \ldots\) are said to be relatively
stable, if it is possible to find constants \(B_n > 0\) such that for every \(\varepsilon > 0\)
\[
P\Big\{ \Big| \frac{\xi_1 + \xi_2 + \cdots + \xi_n}{B_n} - 1 \Big| > \varepsilon \Big\} \to 0 \qquad (n \to \infty).
\]
For the scheme of a double sequence we shall say that the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} \tag{1}
\]
of positive random variables are relatively stable, if for every \(\varepsilon > 0\)
\[
P\{ |\xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - 1| > \varepsilon \} \to 0 \qquad (n \to \infty).
\]

Clearly, relative stability of the sums is the most natural and general
form of the law of large numbers for positive random variables. Hence*
we conclude that if the sums (1) are relatively stable, then the separate
summands are asymptotically constant.† This circumstance permits us‡
to formulate the following theorem:


THEOREM 1.§ In order that the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
of independent positive random variables be relatively stable, it is necessary
and sufficient that as \(n \to \infty\)
\[
1)\quad \sum_{k=1}^{k_n} \int \frac{x^2}{1 + x^2}\,dF_{nk}(x + m_{nk}) \to 0,
\]
\[
2)\quad \sum_{k=1}^{k_n} \Big\{ m_{nk} + \int_{|x| < 1} x\,dF_{nk}(x + m_{nk}) \Big\} \to 1.
\]

* Translator's note. From Theorem 1 of § 27 or 1) below.

† The summands are assumed to be independent.

‡ Translator's note. The authors probably mean that we need not explicitly
assume that the variables are asymptotically constant. See below (S) of § 22.

§ B. V. Gnedenko [41].
Proof. The theorem is an immediate consequence of Theorem 1, § 27.
In the following theorems aiming at establishing the connections
between the conditions for convergence to the normal law and the
conditions for the relative stability of the sums, we shall assume that the
separate summands \(\xi_{nk}\) are infinitesimal.

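THEOREM 2. In order that the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
of independent positive random variables be relatively stable and the
variables \(\xi_{nk}\) be infinitesimal, it is necessary and sufficient that for every \(\varepsilon > 0\)
\[
1)\quad \sum_{k=1}^{k_n} \int_{\varepsilon}^{\infty} dF_{nk}(x) \to 0 \qquad (n \to \infty),
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{0}^{\varepsilon} x\,dF_{nk}(x) \to 1 \qquad (n \to \infty).
\]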


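Proof. To prove the theorem it is sufficient to show that under our
conditions the third relation in Remark 3 after Theorem 1 of § 27 is a
consequence of the first two. This follows since for every \(\delta\) \((0 < \delta < \varepsilon)\) we have
\[
\sum_{k=1}^{k_n} \Big\{ \int_{0}^{\varepsilon} x^2\,dF_{nk}(x) - \Big( \int_{0}^{\varepsilon} x\,dF_{nk}(x) \Big)^2 \Big\}
\le \sum_{k=1}^{k_n} \int_{0}^{\varepsilon} x^2\,dF_{nk}(x)
\le \delta \sum_{k=1}^{k_n} \int_{0}^{\delta} x\,dF_{nk}(x) + \varepsilon^2 \sum_{k=1}^{k_n} \int_{\delta}^{\varepsilon} dF_{nk}(x).
\]
By 2) the first term on the right is asymptotically at most \(\delta\), by 1) the
second term tends to zero, and \(\delta\) is arbitrary.
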

The following two propositions follow readily from the theorem just proved. 


COROLLARY 1. If the random variables \(\xi_{nk}\) are independent in each row
and have mathematical expectations with \(\sum_{k=1}^{k_n} M\xi_{nk} = 1\), then in order that
the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n}
\]
be relatively stable and the variables \(\xi_{nk}\) be infinitesimal, it is necessary and
sufficient that for every \(\varepsilon > 0\)
\[
\sum_{k=1}^{k_n} \int_{\varepsilon}^{\infty} x\,dF_{nk}(x) \to 0 \qquad (n \to \infty).
\]


In particular, we obtain from this 


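COROLLARY 2. If the positive independent random variables \(\xi_1, \xi_2, \ldots,
\xi_n, \ldots\) have mathematical expectations, then in order that the sums
\[
\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n
\]
be relatively stable for the particular coefficients \(B_n\),
\[
B_n = \sum_{k=1}^{n} M\xi_k,
\]
and the variables \(\xi_k / B_n\) \((1 \le k \le n)\) be infinitesimal, it is necessary and
sufficient that for every \(\varepsilon > 0\)
\[
\frac{1}{B_n} \sum_{k=1}^{n} \int_{\varepsilon B_n}^{\infty} x\,dF_k(x) \to 0 \qquad (n \to \infty).
\]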
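
For a concrete feel for this criterion, the following Python sketch evaluates the quantity in the corollary for one illustrative choice that is an assumption of ours and not an example from the text: independent variables with the standard exponential law, so that each expectation equals 1 and B_n = n. The function name `corollary_2_quantity` is hypothetical.

```python
import math

def corollary_2_quantity(n, eps):
    """(1/B_n) * sum_{k<=n} int_{eps*B_n}^inf x dF_k(x) for i.i.d. Exp(1), B_n = n."""
    b_n = float(n)                          # B_n = sum of the expectations = n
    c = eps * b_n                           # truncation level eps * B_n
    tail_mean = (c + 1.0) * math.exp(-c)    # int_c^inf x e^{-x} dx = (c + 1) e^{-c}
    return n * tail_mean / b_n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, corollary_2_quantity(n, 0.1))
```

The quantity decays exponentially in n for every fixed eps, so the sums are relatively stable with B_n = n, which is the ordinary law of large numbers for this sequence stated in relative form.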


THEOREM 3.* In order that for the sequence \(\xi_1, \xi_2, \ldots, \xi_n, \ldots\) of positive
independent random variables the sums
\[
\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n
\]
be relatively stable and the variables \(\xi_k / B_n\) \((1 \le k \le n)\) be infinitesimal
for suitably chosen constants \(B_n\), it is necessary and sufficient that there
exist a sequence of positive constants \(C_1, C_2, \ldots, C_n, \ldots\) such that
\[
1)\quad \sum_{k=1}^{n} \int_{C_n}^{\infty} dF_k(x) \to 0 \qquad (n \to \infty),
\]
\[
2)\quad \frac{1}{C_n} \sum_{k=1}^{n} \int_{0}^{C_n} x\,dF_k(x) \to \infty \qquad (n \to \infty).
\]
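Proof. By Theorem 2 for the relative stability of the sums
\(\zeta_n = (\xi_1 + \xi_2 + \cdots + \xi_n)/B_n\) it is necessary that for every \(\varepsilon > 0\)
\[
1)\quad \sum_{k=1}^{n} \int_{\varepsilon}^{\infty} dF_k(B_n x) \to 0 \qquad (n \to \infty),
\]
\[
2)\quad \sum_{k=1}^{n} \int_{0}^{\varepsilon} x\,dF_k(B_n x) \to 1 \qquad (n \to \infty).
\]

* A. A. Bobrov [12].

Obviously, it is possible to pick a sequence \(\varepsilon_n \to 0\) such that
\[
\sum_{k=1}^{n} \int_{\varepsilon_n}^{\infty} dF_k(B_n x) = \sum_{k=1}^{n} \int_{\varepsilon_n B_n}^{\infty} dF_k(x) \to 0 \qquad (n \to \infty),
\]
\[
\sum_{k=1}^{n} \int_{0}^{\varepsilon_n} x\,dF_k(B_n x) = \frac{1}{B_n} \sum_{k=1}^{n} \int_{0}^{\varepsilon_n B_n} x\,dF_k(x) \to 1 \qquad (n \to \infty).
\]
In order to complete the proof of the necessity of the conditions of the
theorem it remains only to put \(C_n = \varepsilon_n B_n\).
Now let the conditions of the theorem be satisfied. Define \(B_n\) by the
formula
\[
B_n = \sum_{k=1}^{n} \int_{0}^{C_n} x\,dF_k(x).
\]
Then it follows from the second condition of the theorem that
\[
C_n = o(B_n).
\]
Consequently, for every \(\varepsilon > 0\) and sufficiently large \(n\),
\[
\sum_{k=1}^{n} \int_{\varepsilon B_n}^{\infty} dF_k(x) \le \sum_{k=1}^{n} \int_{C_n}^{\infty} dF_k(x).
\]
Hence according to the first condition of the theorem it follows that
as \(n \to \infty\)
\[
\sum_{k=1}^{n} \int_{\varepsilon B_n}^{\infty} dF_k(x) \to 0. \tag{2}
\]
Furthermore, for sufficiently large \(n\),
\[
\frac{1}{B_n} \sum_{k=1}^{n} \int_{0}^{\varepsilon B_n} x\,dF_k(x)
= \frac{1}{B_n} \sum_{k=1}^{n} \int_{0}^{C_n} x\,dF_k(x) + \frac{1}{B_n} \sum_{k=1}^{n} \int_{C_n}^{\varepsilon B_n} x\,dF_k(x)
= 1 + \frac{1}{B_n} \sum_{k=1}^{n} \int_{C_n}^{\varepsilon B_n} x\,dF_k(x).
\]
But by the first condition of the theorem, as \(n \to \infty\)
\[
\frac{1}{B_n} \sum_{k=1}^{n} \int_{C_n}^{\varepsilon B_n} x\,dF_k(x) \le \varepsilon \sum_{k=1}^{n} \int_{C_n}^{\infty} dF_k(x) \to 0.
\]
Thus for every \(\varepsilon > 0\)
\[
\frac{1}{B_n} \sum_{k=1}^{n} \int_{0}^{\varepsilon B_n} x\,dF_k(x) \to 1 \tag{3}
\]
as \(n \to \infty\). According to the preceding theorem, it follows from (2) and
(3) that the sums \(\xi_1 + \xi_2 + \cdots + \xi_n\) are relatively stable and the variables
\(\xi_k / B_n\) are infinitesimal.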
We shall now show the close connection between relative stability and 
convergence to the normal law. 


THEOREM 4.* In order that the distribution laws of the sums
\[
\zeta_n = \sum_{k=1}^{k_n} (\xi_{nk} - M\xi_{nk})
\]
of random variables which are independent in each row and subject to the
conditions
\[
\sum_{k=1}^{k_n} D\xi_{nk} = 1, \qquad \sup_{1 \le k \le k_n} P\{ |\xi_{nk} - M\xi_{nk}| > \varepsilon \} \to 0
\]
for every \(\varepsilon > 0\), converge to the normal law
\[
\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt, \tag{4}
\]
it is necessary and sufficient that the sums
\[
\eta_n = \sum_{k=1}^{k_n} (\xi_{nk} - M\xi_{nk})^2
\]
of the squared deviations of the random variables from their mathematical
expectations be relatively stable.


If the existence of moments is not assumed, then the theorem we are
interested in can be formulated as follows [87]:

THEOREM 5. In order that for some suitably chosen constants \(A_n\) the
distribution laws of the sums
\[
\zeta_n = \xi_{n1} + \xi_{n2} + \cdots + \xi_{nk_n} - A_n
\]
of independent infinitesimal random variables converge to the normal
law (4), it is necessary and sufficient that the sums
\[
\eta_n = \sum_{k=1}^{k_n} \Big( \xi_{nk} - \int_{|x| < 1} x\,dF_{nk}(x) \Big)^2
\]
of the squared deviations of the \(\xi_{nk}\) from their "truncated mathematical
expectations" \(\int_{|x| < 1} x\,dF_{nk}(x)\) be relatively stable.

* D. A. Raikov [89].

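Proof. If the distribution function of the random variable \(\xi_{nk}\) is \(F_{nk}(x)\),
then that of the variables
\[
\bar\xi_{nk} = \xi_{nk} - \int_{|x| < 1} x\,dF_{nk}(x)
\qquad \{\text{in Theorem 4, } \bar\xi_{nk} = \xi_{nk} - M\xi_{nk}\}
\]
is
\[
\bar F_{nk}(x) = F_{nk}\Big( x + \int_{|x| < 1} x\,dF_{nk}(x) \Big)
\qquad \{\text{in Theorem 4, } \bar F_{nk}(x) = F_{nk}(x + M\xi_{nk})\}.
\]
The distribution function of the random variable \(\bar\xi_{nk}^2\) is determined at
its continuity points by the equation
\[
H_{nk}(y) = P\{ \bar\xi_{nk}^2 < y \} = P\{ |\bar\xi_{nk}| < \sqrt{y} \} = \bar F_{nk}(\sqrt{y}) - \bar F_{nk}(-\sqrt{y}).
\]
Now it is obvious that
\[
1)\quad \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} d\bar F_{nk}(x)
= \sum_{k=1}^{k_n} \int_{x \ge \varepsilon} d\{ \bar F_{nk}(x) - \bar F_{nk}(-x) \}
= \sum_{k=1}^{k_n} \int_{y \ge \varepsilon^2} d\{ \bar F_{nk}(\sqrt{y}) - \bar F_{nk}(-\sqrt{y}) \}
= \sum_{k=1}^{k_n} \int_{y \ge \varepsilon^2} dH_{nk}(y),
\]
\[
2)\quad \sum_{k=1}^{k_n} \int_{|x| < \varepsilon} x^2\,d\bar F_{nk}(x)
= \sum_{k=1}^{k_n} \int_{0 \le x < \varepsilon} x^2\,d\{ \bar F_{nk}(x) - \bar F_{nk}(-x) \}
= \sum_{k=1}^{k_n} \int_{0 \le y < \varepsilon^2} y\,dH_{nk}(y),
\]
\[
3)\quad \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon} x^2\,d\bar F_{nk}(x) = \sum_{k=1}^{k_n} \int_{y \ge \varepsilon^2} y\,dH_{nk}(y).
\]
The first two of these equations show that the conditions of Theorem 2
of this section and those of Theorem 2 of § 26 are either simultaneously
satisfied or simultaneously violated. The last equation shows that the
same is true of the first corollary to Theorem 2 and the particular case of
Theorem 3 of § 21. The proof of Theorems 4 and 5 is thereby completed.

It is clear that analogous theorems can be formulated and proved for
asymptotically constant variables.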


CHAPTER 6 
LIMIT THEOREMS FOR CUMULATIVE SUMS 


§ 29. DISTRIBUTIONS OF THE CLASS L


As we have already pointed out, the general statement of the problem
concerning limit distributions for sums of independent random variables,
considered in the last two chapters, belongs to the last two decades. In
the classical investigations only two particular cases of this problem were
considered, namely, one sought for the conditions which must be imposed
on a sequence of random variables
\[
\xi_1, \xi_2, \ldots, \xi_n, \ldots \tag{1}
\]
so that 1) the law of large numbers should hold, 2) the central limit
theorem should hold. In 1936 A. Ya. Khintchine formulated for the
classical scheme of a sequence of mutually independent random variables
the general problem of determining the class of distributions which can
appear as limits of the distributions of the sums
\[
\zeta_n = \frac{1}{B_n} \sum_{k=1}^{n} \xi_k - A_n \tag{2}
\]
for suitably chosen real constants \(B_n > 0\) and \(A_n\).

Just as in the general case, for the solution of this problem it is necessary
to introduce reasonable restrictions; namely, we assume that the variables
\[
\xi_{nk} = \frac{\xi_k}{B_n} \qquad (1 \le k \le n;\ n = 1, 2, \ldots)
\]
are asymptotically constant.

It is clear that under this assumption every limit distribution for the 
normalized sums (2) is necessarily infinitely divisible. However, the 
converse is not true: there exist infinitely divisible distributions which 
cannot be the limiting distributions of the sums (2) for any choice of the 
constants \(B_n > 0\) and \(A_n\) and any choice of the sequence (1). This
circumstance makes it transparently clear why in the classical investigations
in order to obtain the Poisson law as “the law of rare events” it was 
necessary to have recourse to the scheme of a double sequence, already 
considered above. 

Following Khintchine, we shall say that the distribution function
\(F(x)\) belongs to the class L, if it is possible to find a sequence of independent
random variables (1) such that for suitably chosen constants \(B_n > 0\) and
\(A_n\) the distribution functions of the sums (2) converge to \(F(x)\), and the
variables \(\xi_{nk} = \xi_k / B_n\) \((1 \le k \le n)\) are asymptotically constant.

We note that without loss of generality we may suppose the variables
\(\xi_{nk} = \xi_k / B_n\) to be infinitesimal in the following. In fact, if the \(\xi_{nk}\) are
asymptotically constant, then putting
\[
\xi_{nk}' = \xi_{nk} - m_{nk} = \frac{\xi_k - m_k}{B_n}
\]
(\(m_{nk}\) and \(m_k\) are medians corresponding to \(\xi_{nk}\) and \(\xi_k\)) and
\[
A_n' = A_n - \sum_{k=1}^{n} \frac{m_k}{B_n},
\]
we see that the class of limit distributions of the sums (2) of
asymptotically constant summands \(\xi_{nk} = \xi_k / B_n\) coincides with the class of
limit distributions of the sums (2) of infinitesimal summands
\[
\xi_{nk}' = \frac{\xi_k - m_k}{B_n}.
\]
P. Lévy [76] gave a complete characterization of the class L in answer to
a question raised by A. Ya. Khintchine.

Before turning to an exposition of P. Lévy's theorem, we pause to
prove a lemma (Khintchine [59]).

LEMMA. If the distribution function which is the limit of the distribution
functions of the sums (2) of independent infinitesimal summands
\(\xi_{nk} = \xi_k / B_n\) is proper, then as \(n \to \infty\)
\[
\text{(a)}\quad B_n \to \infty,
\qquad
\text{(b)}\quad \frac{B_{n+1}}{B_n} \to 1.
\]

Proof. (a) We suppose that there exists a sequence of indices
\(n_1 < n_2 < \cdots < n_k < \cdots\) such that the numbers \(B_{n_k}\) remain bounded.
Without loss of generality, we may suppose the indices to have been chosen
so that the \(B_{n_k}\) converge to some number \(B \ne \infty\) as \(k \to \infty\). Let \(t\) be any
given number; then the numbers \(t_k = t B_{n_k}\) converge to \(tB\) as \(k \to \infty\). By
hypothesis, the variables \(\xi_{n_k s}\) \((1 \le s \le n_k)\) are infinitesimal, and so as \(k \to \infty\)
\[
f_s(t) = f_s\Big( \frac{t_k}{B_{n_k}} \Big) \to 1
\]
uniformly in \(s\) \((1 \le s \le n_k)\), i.e., for every \(t\)
\[
f_s(t) = 1 \qquad (s = 1, 2, \ldots).
\]
It follows readily that for every \(t\),
\[
|f(t)| = \lim_{n \to \infty} \Big| \prod_{s=1}^{n} f_s\Big( \frac{t}{B_n} \Big) \Big| = 1.
\]
But this equation means that \(F(x)\) is an improper distribution function.
We have arrived at a contradiction, proving the first part of the lemma.

(b) We note that since the summands \(\xi_{nk}\) are infinitesimal, the
distribution functions of the sums
\[
\frac{\xi_1 + \xi_2 + \cdots + \xi_n}{B_{n+1}} - A_{n+1} \tag{3}
\]
also converge to \(F(x)\). If we denote by \(F_n(x)\) the distribution function of
the sum (2), then the distribution function of the sum (3) is equal to
\(F_n(B_n' x + A_n')\), where
\[
B_n' = \frac{B_{n+1}}{B_n}, \qquad A_n' = \frac{B_{n+1}}{B_n} A_{n+1} - A_n.
\]
According to Theorem 2 of § 10 it follows from this that \(B_{n+1}/B_n \to 1\)
as \(n \to \infty\). The lemma is completely proved.


THEOREM 1.* In order that the distribution function \(F(x)\) belong to the
class L, it is necessary and sufficient that for every \(\alpha\) \((0 < \alpha < 1)\) \(F(x)\) be
the composition of \(F(x/\alpha)\) and some other distribution function \(F_\alpha(x)\).


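Proof. Sufficiency. By hypothesis, for every $\alpha$ $(0 < \alpha < 1)$

\[ f(t) = f(\alpha t)\, f_\alpha(t), \]

where $f_\alpha(t)$ is some characteristic function. We note first of all that a function $f(t)$ satisfying this condition never vanishes. In fact, suppose for example that $f(2a) = 0$ and $f(t) \ne 0$ for $0 \le t < 2a$. Then $f_\alpha(2a) = 0$ and

\[ 1 = 1 - |f_\alpha(2a)|^2 \le 4\left(1 - |f_\alpha(a)|^2\right) \tag{4} \]

for every $\alpha$ $(0 < \alpha < 1)$. But since the function $f(t)$ is continuous, as $\alpha \to 1$

\[ f_\alpha(a) = \frac{f(a)}{f(\alpha a)} \to 1. \]

Thus the inequality (4) as $\alpha$ approaches one leads to a contradiction.

We construct independent random variables $\xi_k$ with the characteristic functions

\[ f_{\xi_k}(t) = \frac{f(kt)}{f((k-1)t)} \quad (k = 1, 2, \dots). \]

The characteristic function of the sum

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{n} \]

is equal to

\[ \prod_{k=1}^{n} f_{\xi_k}\!\left(\frac{t}{n}\right) = \prod_{k=1}^{n} \frac{f\!\left(\frac{k}{n}t\right)}{f\!\left(\frac{k-1}{n}t\right)} = f(t). \]

Since the function $f(t)$ is continuous and never vanishes, it is evident that

\[ f_{\xi_k}\!\left(\frac{t}{n}\right) = \frac{f\!\left(\frac{k}{n}t\right)}{f\!\left(\frac{k-1}{n}t\right)} \to 1 \quad (n \to \infty) \]

uniformly in $k$ $(1 \le k \le n)$. Thus it is proved that $F(x)$ belongs to the class L.

* P. Lévy [76].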
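The telescoping construction used in the sufficiency proof can be checked numerically. The sketch below is an added illustration, not part of the original text; it assumes the particular characteristic function $f(t) = e^{-|t|^a}$ with $a = 1.5$ (a symmetric stable law), and all function names are hypothetical.

```python
import math


def f(t, a=1.5):
    """Assumed example: characteristic function exp(-|t|**a) of a symmetric stable law."""
    return math.exp(-abs(t) ** a)


def summand_cf(k, t, a=1.5):
    """Characteristic function f(k t) / f((k - 1) t) of the k-th summand xi_k."""
    return f(k * t, a) / f((k - 1) * t, a)


def normalized_sum_cf(n, t, a=1.5):
    """Characteristic function of (xi_1 + ... + xi_n) / n, i.e. the telescoping product."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= summand_cf(k, t / n, a)
    return prod


def max_deviation_from_one(n, t, a=1.5):
    """Largest |f_{xi_k}(t / n) - 1| over 1 <= k <= n (measures infinitesimality)."""
    return max(abs(summand_cf(k, t / n, a) - 1.0) for k in range(1, n + 1))
```

For this choice the product reproduces $f(t)$ for every $n$ up to rounding error, while the largest deviation of a single factor from one shrinks as $n$ grows, which is the uniform convergence used in the proof.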

Necessity. Now suppose that the sequence of independent random variables

\[ \xi_1, \xi_2, \dots, \xi_n, \dots \]

is such that for some suitably chosen constants $B_n > 0$ and $A_n$ the distribution functions of the sums

\[ \zeta_n = \frac{1}{B_n} \sum_{k=1}^{n} \xi_k - A_n \]

converge to a limit distribution function $F(x)$ and the variables $\xi_{nk} = \xi_k / B_n$ $(1 \le k \le n)$ are infinitesimal. If $F(x)$ is an improper distribution function then the condition of the theorem is trivially verified and the distribution function $F_\alpha(x)$ will also be improper. Only the case that $F(x)$ is a proper distribution function requires a proof.

Expressing our hypothesis in terms of characteristic functions, we obtain: as $n \to \infty$

\[ f_{\zeta_n}(t) = e^{-itA_n} \prod_{k=1}^{n} f_k\!\left(\frac{t}{B_n}\right) \to f(t), \tag{5} \]

$f_k(t/B_n) \to 1$ uniformly in $k$ $(1 \le k \le n)$, and $f(t) \ne 0$ for any $t$ (being a characteristic function of an infinitely divisible law).

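According to the preceding lemma, for every given $\alpha$ $(0 < \alpha < 1)$ it is possible to pick $m = m(n)$ $(m < n)$ so that as $n \to \infty$

\[ \frac{B_m}{B_n} \to \alpha. \tag{6} \]

We write $f_{\zeta_n}(t)$ in the following form:

\[ f_{\zeta_n}(t) = \left[ e^{-itA_m B_m/B_n} \prod_{k=1}^{m} f_k\!\left(\frac{t}{B_n}\right) \right] \left[ e^{-it\left(A_n - A_m B_m/B_n\right)} \prod_{k=m+1}^{n} f_k\!\left(\frac{t}{B_n}\right) \right]. \tag{7} \]

But by (5), as $m \to \infty$

\[ e^{-itA_m} \prod_{k=1}^{m} f_k\!\left(\frac{t}{B_m}\right) \to f(t); \]

hence, because of (6),

\[ e^{-itA_m B_m/B_n} \prod_{k=1}^{m} f_k\!\left(\frac{t}{B_n}\right) \to f(\alpha t). \tag{8} \]

The relations (5) and (8), together with Theorem 2 of § 13, permit us to conclude that the second factor on the right side of (7) must approach some characteristic function $f_\alpha(t)$.

Thus we find in the limit that for every $\alpha$ $(0 < \alpha < 1)$

\[ f(t) = f(\alpha t)\, f_\alpha(t). \]

Q.E.D.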

For later purposes it is important to note that $f_\alpha(t)$ is the characteristic function of an infinitely divisible distribution. Indeed, $f_\alpha(t)$ is the limit of a sequence of characteristic functions of sums of independent and asymptotically constant random variables.


§ 30. CANONICAL REPRESENTATION OF DISTRIBUTIONS OF THE CLASS L


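Each distribution of the class L is infinitely divisible; hence the logarithm of its characteristic function can be represented by Lévy's formula

\[ \log f(t) = i\gamma t - \frac{\sigma^2 t^2}{2} + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dM(u) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dN(u). \tag{1} \]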


The question naturally arises as to what special properties the functions $M(u)$ and $N(u)$ must possess in order that $f(t)$ be the characteristic function of a distribution of the class L. To this question a complete answer is given by the following theorem, discovered by P. Lévy [76].


THEOREM 1. In order that the distribution function $F(x)$ belong to the class L it is necessary and sufficient that the functions $M(u)$ and $N(u)$ in the formula (1) have right and left derivatives for every value $u$ and that the functions

\[ uM'(u) \quad (u < 0), \qquad uN'(u) \quad (u > 0) \]

be nonincreasing [here $M'(u)$ and $N'(u)$ denote either the right or the left derivative, possibly different ones at different points].


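Proof. Let $f(t)$ be the characteristic function of a distribution belonging to the class L, and $\alpha$ an arbitrary number between 0 and 1. We find from (1) that

\[
\begin{aligned}
\log f(\alpha t) &= i\gamma\alpha t - \frac{\sigma^2\alpha^2 t^2}{2} + \int_{-\infty}^{0} \left( e^{i\alpha tu} - 1 - \frac{i\alpha tu}{1+u^2} \right) dM(u) + \int_{0}^{\infty} \left( e^{i\alpha tu} - 1 - \frac{i\alpha tu}{1+u^2} \right) dN(u) \\
&= i\gamma\alpha t - \frac{\sigma^2\alpha^2 t^2}{2} + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dM\!\left(\frac{u}{\alpha}\right) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dN\!\left(\frac{u}{\alpha}\right) \\
&\quad + it\alpha \int_{-\infty}^{0} \frac{u^3 (1-\alpha^2)}{(1+u^2)(1+\alpha^2 u^2)}\, dM(u) + it\alpha \int_{0}^{\infty} \frac{u^3 (1-\alpha^2)}{(1+u^2)(1+\alpha^2 u^2)}\, dN(u).
\end{aligned}
\]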





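Now

\[ \log \frac{f(t)}{f(\alpha t)} = i\gamma_1 t - \frac{\sigma^2 (1-\alpha^2) t^2}{2} + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) d\!\left( M(u) - M\!\left(\frac{u}{\alpha}\right) \right) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) d\!\left( N(u) - N\!\left(\frac{u}{\alpha}\right) \right), \tag{2} \]

where $\gamma_1$ is a real constant.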


By the results of the preceding theorem, $f(t)/f(\alpha t)$ is an infinitely divisible characteristic function; (2) gives its canonical representation. Hence we conclude that the functions $M(u) - M(u/\alpha)$ and $N(u) - N(u/\alpha)$ must be nondecreasing.† Therefore, whatever $u_1 < u_2 < 0$ and $0 < v_1 < v_2$ are, we must have

\[ M(u_1) - M\!\left(\frac{u_1}{\alpha}\right) \le M(u_2) - M\!\left(\frac{u_2}{\alpha}\right), \]
\[ N(v_1) - N\!\left(\frac{v_1}{\alpha}\right) \le N(v_2) - N\!\left(\frac{v_2}{\alpha}\right), \]

and so

\[ M\!\left(\frac{u_2}{\alpha}\right) - M\!\left(\frac{u_1}{\alpha}\right) \le M(u_2) - M(u_1), \qquad N\!\left(\frac{v_2}{\alpha}\right) - N\!\left(\frac{v_1}{\alpha}\right) \le N(v_2) - N(v_1). \tag{3} \]

Conversely, if the functions $M(u)$ and $N(u)$ satisfy the inequalities (3) for every $\alpha$ $(0 < \alpha < 1)$, then the functions $M(u) - M(u/\alpha)$ and $N(u) - N(u/\alpha)$ will be nondecreasing, and consequently $f(t)/f(\alpha t)$ will be the characteristic function of an infinitely divisible law. Therefore the conditions (3) are necessary and sufficient for the distribution to belong to the class L.

We confine our further discussion to the function $N(u)$, since it is possible to obtain similar results for $M(u)$ by the same arguments.

Let $a < b$, $h > 0$ and $\alpha = e^{-h}$. By (3) we have

\[ N(e^b) - N(e^a) \ge N\!\left(\frac{e^b}{\alpha}\right) - N\!\left(\frac{e^a}{\alpha}\right) = N(e^{b+h}) - N(e^{a+h}). \]

If we denote

\[ N(e^v) = S(v), \]

then it follows from the preceding inequality that the nondecreasing function $S(v)$ satisfies the inequality

\[ S(a+h) - S(a) \ge S(b+h) - S(b) \quad (a < b). \tag{4} \]





† Translator's note. For this conclusion we need the Corollary to Theorem 1 of § 18.

Thus, the increment of $S(v)$ in intervals of given length $h$ can only decrease when the interval shifts from left to right.

This circumstance permits us to conclude first of all that the function $S(v)$ is continuous at every point $v$.

We put now $b = a + h$. We find from (4) that

\[ S(a+h) \ge \frac{S(a) + S(a+2h)}{2}, \]

i.e., the continuous function $S(v)$ is concave.* Therefore it has finite left and right derivatives at every point; the value of the right derivative never exceeds that of the left derivative; and both derivatives are nonincreasing as $v$ increases (see Hardy [48], p. 91).

Thus $S'(v)$ is a nonincreasing function [$S'(v)$ does not necessarily remain the right or the left derivative throughout]. But

\[ S(v) = N(e^v), \qquad S'(v) = e^v N'(e^v), \]

hence, returning to the notation $u = e^v$, we find that the conditions of the theorem are necessary.


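Now suppose that $uM'(u)$ and $uN'(u)$ are nonincreasing functions. Then, for every $\alpha$ $(0 < \alpha < 1)$,

\[ \frac{u}{\alpha} M'\!\left(\frac{u}{\alpha}\right) \ge uM'(u) \quad \text{for } u < 0, \]
\[ \frac{u}{\alpha} N'\!\left(\frac{u}{\alpha}\right) \le uN'(u) \quad \text{for } u > 0. \]

Hence if $v_1 < v_2 < 0$, then

\[ \int_{v_1}^{v_2} dM\!\left(\frac{v}{\alpha}\right) = \frac{1}{\alpha} \int_{v_1}^{v_2} M'\!\left(\frac{v}{\alpha}\right) dv \le \int_{v_1}^{v_2} M'(v)\, dv = \int_{v_1}^{v_2} dM(v), \]

and if $0 < u_1 < u_2$, then

\[ \int_{u_1}^{u_2} dN\!\left(\frac{u}{\alpha}\right) = \frac{1}{\alpha} \int_{u_1}^{u_2} N'\!\left(\frac{u}{\alpha}\right) du \le \int_{u_1}^{u_2} N'(u)\, du = \int_{u_1}^{u_2} dN(u). \]

These inequalities, as we have seen above, prove that the distribution $F(x)$ belongs to the class L.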

The simplest example of a distribution belonging to the class L is the normal distribution. Then the functions $M(u)$ and $N(u)$ are identically equal to zero.
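A second example may be worked out directly; it is added here as an illustration under stated assumptions rather than as part of the original argument. Take $M(u) = c_1 |u|^{-\alpha}$ for $u < 0$ and $N(u) = -c_2 u^{-\alpha}$ for $u > 0$, with assumed constants $c_1 \ge 0$, $c_2 \ge 0$ and $0 < \alpha < 2$ (this is the form obtained for stable laws in § 34). Then

```latex
\[
uN'(u) = u \cdot c_2 \alpha\, u^{-\alpha-1} = c_2 \alpha\, u^{-\alpha} \quad (u > 0),
\qquad
uM'(u) = u \cdot c_1 \alpha\, |u|^{-\alpha-1} = -c_1 \alpha\, |u|^{-\alpha} \quad (u < 0).
\]
```

Both expressions are nonincreasing on their domains, so the conditions of Theorem 1 are satisfied and every such law belongs to the class L.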




* Translator's note. In the original the word "convex" was written instead of "concave"; further on, the words "right" and "left" were interchanged and "increase" was written instead of "do not increase."

We now determine what conditions must be imposed on the function $K(u)$ in Kolmogorov's formula in order that a distribution with finite variance belong to the class L. Since the functions $M(u)$ and $N(u)$ are related to $K(u)$ by the formulas

\[ dM(u) = \frac{1}{u^2}\, dK(u) \quad \text{for } u < 0, \]
\[ dN(u) = \frac{1}{u^2}\, dK(u) \quad \text{for } u > 0, \]

the following result [46] is readily obtained from Theorem 1.

THEOREM 2. In order that the distribution function $F(x)$ with finite variance belong to the class L, it is necessary and sufficient that the function $K(u)$ in Kolmogorov's formula have right and left derivatives at every point $u \ne 0$ and that the function $K'(u)/u$, where $K'(u)$ denotes the right or left derivative, possibly different ones at different points, be nonincreasing for $u < 0$ and $u > 0$.


It is easily verified that distributions with finite variances belonging to the class L are not exhausted by the normal and the improper distributions. For example, the distribution for which

\[ K(u) = \begin{cases} 0 & \text{for } u < 0, \\ u^2 & \text{for } 0 \le u \le 1, \\ 1 & \text{for } u > 1, \end{cases} \]

in Kolmogorov's formula, satisfies the conditions of Theorem 2 and so belongs to the class L.
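The condition of Theorem 2 can be checked mechanically for this example. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and at the corner $u = 1$ the right derivative is used (the theorem permits either one-sided derivative).

```python
def k_prime(u):
    """One-sided derivative of K(u) = 0 (u < 0), u**2 (0 <= u <= 1), 1 (u > 1).

    At u = 1 the right derivative (zero) is returned.
    """
    if 0.0 < u < 1.0:
        return 2.0 * u
    return 0.0


def ratio(u):
    """The function K'(u) / u appearing in Theorem 2 (defined for u != 0)."""
    return k_prime(u) / u


def is_nonincreasing(values, tol=1e-12):
    """True if each value is no larger than its predecessor, up to a tolerance."""
    return all(b <= a + tol for a, b in zip(values, values[1:]))
```

On the positive half-line the ratio equals 2 on $(0, 1)$ and 0 afterwards, and on the negative half-line it vanishes identically, so it is nonincreasing on each half-line separately, as the theorem requires.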


§ 31. CONDITIONS FOR CONVERGENCE


In this section we shall derive conditions which it is necessary to impose on the distribution functions of independent random variables $\xi_1, \xi_2, \dots, \xi_n, \dots$ in order that for suitably chosen constants $B_n > 0$ and $A_n$ the distribution functions of the sums

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n \tag{1} \]

converge to some limit distribution function and the summands $\xi_{nk} = \xi_k / B_n$ $(1 \le k \le n)$ be asymptotically constant.*


* Throughout this section the case of an improper limit distribution is excluded.

If the constants $B_n$ are given in advance, then the solution of the problem posed above is already contained in the theorems of § 25. The essential difference between the present problem and the general problem solved before is that there is now indicated a general rule by which the constants $B_n$ should be chosen.

If the variables $\xi_k$ have finite variances and if conditions are sought for the convergence of the distribution functions and the variances of the sums (1) to a limit distribution function and its variance, then the question receives a most simple answer.


THEOREM 1.* In order that for suitably chosen constants $A_n$ and $B_n > 0$ the distribution functions of the sums (1) converge to a limit, their variances converge to the variance of the limit, and the summands $(\xi_k - M\xi_k)/B_n$ be infinitesimal, it is necessary and sufficient that there exist a nondecreasing function $K_1(u)$ with variation equal to one such that for all $u \ne 0$

\[ \frac{1}{C_n^2} \sum_{k=1}^{n} \int_{-\infty}^{C_n u} x^2\, dF_k(x + M\xi_k) \to K_1(u) \tag{2} \]

as $n \to \infty$, and moreover that

\[ \sup_{1 \le k \le n} \int \frac{x^2}{C_n^2 + x^2}\, dF_k(x + M\xi_k) \to 0, \tag{3} \]

where

\[ C_n^2 = \sum_{k=1}^{n} D\xi_k. \]

The constants $A_n$ and $B_n$ may be chosen according to the formulas

\[ B_n^2 = \sum_{k=1}^{n} D\xi_k, \qquad A_n = \frac{1}{B_n} \sum_{k=1}^{n} M\xi_k. \]

The limit distribution is defined by Kolmogorov's formula with the constant $\gamma = 0$ and the function $K(u) = K_1(u)$.


Proof. Let $\xi_{nk} = (\xi_k - M\xi_k)/C_n$; then $F_{nk}(x) = F_k(C_n x + M\xi_k)$, and by (4) of § 20 the condition

\[ \sup_{1 \le k \le n} \int \frac{x^2}{C_n^2 + x^2}\, dF_k(x + M\xi_k) = \sup_{1 \le k \le n} \int \frac{x^2}{1 + x^2}\, dF_{nk}(x) \to 0 \]

simply means that the variables $\xi_{nk}$ $(1 \le k \le n)$ are infinitesimal.

* B. V. Gnedenko and A. V. Groshev [26].

Suppose that the conditions of the theorem are satisfied. Then the choice of $C_n$ implies that as $n \to \infty$

\[ \sum_{k=1}^{n} \int_{-\infty}^{u} x^2\, dF_{nk}(x) \to K_1(u). \]

According to Theorem 2 of § 21 the distribution functions of the sums

\[ \frac{\xi_1 + \dots + \xi_n}{C_n} - A_n \]

converge to a limit which is given by putting $K(u) = K_1(u)$ in Kolmogorov's formula. If we take $\gamma = 0$ in Kolmogorov's formula for the limit distribution then $A_n$ must be chosen as indicated in the statement of the theorem.

We shall now prove the necessity of the conditions of the theorem. Suppose that for some $A_n$ and $B_n > 0$ the distribution functions of the sums (1) and their variances converge to a limit distribution function and its variance and the summands $(\xi_k - M\xi_k)/B_n$ are infinitesimal. According to Theorem 2 of § 21, under these conditions there exists a nondecreasing function $K(u)$ such that at all continuity points of $K(u)$, hence by Theorem 2 of § 30 at all $u \ne 0$ and also for $u = \pm\infty$,

\[ \sum_{k=1}^{n} \int_{-\infty}^{u} x^2\, dF_k(B_n x + M\xi_k) \to K(u) \quad (n \to \infty). \tag{4} \]

If $V > 0$ is the variation of $K(u)$, then we put $v = u/\sqrt{V}$ and

\[ K_1(v) = \frac{1}{V} K(u). \]

By (4), for all $v \ne 0$

\[ \sum_{k=1}^{n} \int_{-\infty}^{v} x^2\, dF_k(B_n \sqrt{V} x + M\xi_k) = \frac{1}{V} \sum_{k=1}^{n} \int_{-\infty}^{u} x^2\, dF_k(B_n x + M\xi_k) \to K_1(v) \]

and, in particular, for $v = +\infty$

\[ \frac{1}{V B_n^2} \sum_{k=1}^{n} D\xi_k = \sum_{k=1}^{n} \int x^2\, dF_k(B_n \sqrt{V} x + M\xi_k) \to K_1(+\infty) = 1. \]

By choosing new $B_n$ according to the formula

\[ V B_n^2 = \sum_{k=1}^{n} D\xi_k, \]

we do not change the limit law for the sums (1), as follows from Theorem 2 of § 10. Q.E.D.

We now turn to the consideration of the general case. Suppose that the random variables $\xi_k$ and $\xi_k'$ have the same distribution function $F_k(x)$ and that $\xi_k$ is independent of $\xi_l'$ for all $l$. The distribution function of the difference $\eta_k = \xi_k - \xi_k'$ is denoted by $V_k(x)$. Obviously, $V_k(x)$ is a symmetrical distribution function and its characteristic function $v_k(t)$ is $|f_k(t)|^2$. Now suppose that for suitably chosen constants $B_n > 0$ and $A_n$ the distribution functions of the sums

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n \tag{5} \]

converge to a limit distribution function $F(x)$ as $n \to \infty$. Then it is obvious that the distribution functions of the sums

\[ \frac{\eta_1 + \eta_2 + \dots + \eta_n}{B_n} = \left( \frac{\xi_1 + \dots + \xi_n}{B_n} - A_n \right) - \left( \frac{\xi_1' + \dots + \xi_n'}{B_n} - A_n \right) \tag{6} \]

converge to the function $V(x) = F(x) * [1 - F(-x + 0)]$, the characteristic function of which is equal to $v(t) = |f(t)|^2$.


THEOREM 2.† In order that for suitably chosen constants $B_n > 0$ and $A_n$ the distribution functions of the sums (5) converge to a proper distribution function $F(x)$ and the summands $\xi_k / B_n$ $(1 \le k \le n)$ be infinitesimal, it is necessary and sufficient that there exist a nondecreasing function $G^*(u)$ [$G^*(-\infty) = 0$] with finite variation $V = G^*(+\infty)$ such that, if we define $C_n > 0$ by the equation

\[ \sum_{k=1}^{n} \int_{-\infty}^{\infty} \frac{x^2}{C_n^2 + x^2}\, dV_k(x) = 2V, \tag{7} \]

then we have

($\alpha$) for every $u$ $(-\infty \le u \le +\infty)$ except possibly $u = 0$


т 


Y | аана) 0t) i по, 
x 
со n 


k=1- 


($\beta$)
\[ \sup_{1 \le k \le n} \int \frac{x^2}{C_n^2 + x^2}\, dF_k(x) \to 0 \quad \text{as } n \to \infty. \]

Proof. The condition ($\beta$) means simply that the summands $\xi_k / C_n$ $(1 \le k \le n)$ are infinitesimal.

Since $\xi_{nk} = \xi_k / B_n$ and $F_{nk}(x) = F_k(B_n x)$, it follows from Theorem 1 of § 25 that a necessary and sufficient condition for the convergence of the distributions of the sums (5) to a limit is the following:

† B. V. Gnedenko and A. V. Groshev [46].

lim Brae IF d au) = 0* (и) (8)


NET 


The sufficiency of ($\alpha$) is now obvious. In fact, it means that (8) is satisfied with $B_n = C_n$.

Now suppose that the distribution functions of the sums (5) converge to $F(x)$ for some choice of the constants $B_n$. If we prove that $B_n / C_n \to 1$ as $n \to \infty$, then by Theorem 2 of § 10 the distribution functions of the sums

\[ \frac{\xi_1 + \xi_2 + \dots + \xi_n}{C_n} - \frac{B_n}{C_n} A_n \]

also converge to $F(x)$, and so ($\alpha$) is satisfied in view of (8).

The distribution functions of the sums (6), as already stated, converge to $V(x)$. The function $G(u)$ in the formula of Lévy and Khintchine for $V(x)$ is

\[ G(u) = G^*(u) + V - G^*(-u). \]

The median of the random variable $\eta_k$ is equal to zero. Therefore by Theorem 3 of § 25 for the convergence of the distribution functions of the sums (6) to $V(x)$ it is necessary that for $-\infty \le u \le +\infty$, except possibly for $u = 0$,

\[ \sum_{k=1}^{n} \int_{-\infty}^{B_n u} \frac{x^2}{B_n^2 + x^2}\, dV_k(x) \to G(u) \quad \text{as } n \to \infty. \tag{9} \]


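From (7) and (9) (for $u = +\infty$) we conclude that as $n \to \infty$

\[ R_n = \sum_{k=1}^{n} \left[ \int \frac{x^2}{B_n^2 + x^2}\, dV_k(x) - \int \frac{x^2}{C_n^2 + x^2}\, dV_k(x) \right] \to 0. \tag{10} \]

By hypothesis, $F(x)$ is a proper distribution function. This means that $G^*(u) \not\equiv 0$ and consequently $G(u) \not\equiv 0$. Therefore it is possible to find $a > 0$ such that

\[ A = G(a) - G(-a) > 0. \]

From (9) we conclude that for $n > n_0$

\[ \sum_{k=1}^{n} \int_{-B_n a}^{B_n a} \frac{x^2}{B_n^2 + x^2}\, dV_k(x) > \frac{A}{2}. \]

For $n > n_0$ we have the following chain of obvious inequalities:

\[
|R_n| = |C_n^2 - B_n^2| \sum_{k=1}^{n} \int \frac{x^2}{(B_n^2 + x^2)(C_n^2 + x^2)}\, dV_k(x)
\ge \frac{|C_n^2 - B_n^2|}{C_n^2 + B_n^2 a^2} \sum_{k=1}^{n} \int_{-B_n a}^{B_n a} \frac{x^2}{B_n^2 + x^2}\, dV_k(x)
\ge \frac{|C_n^2 - B_n^2|}{C_n^2 + B_n^2 a^2} \cdot \frac{A}{2}.
\]

By (10) it follows from this that $B_n / C_n \to 1$ as $n \to \infty$. Q.E.D.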


§ 32.* UNIMODALITY OF DISTRIBUTIONS OF THE CLASS L


In mathematical statistics a distribution function is called unimodal if its derivative $F'(x)$ exists everywhere and has a unique (finite) maximum. Following A. Ya. Khintchine, we generalize this concept somewhat.


DEFINITION. The distribution function $F(x)$ is called unimodal if there exists at least one value $x = a$ such that $F(x)$ is convex for $x < a$ and concave for $x > a$.


It is easy to verify that the normal distribution, the Cauchy law ($F(x) = \frac{1}{2} + (1/\pi) \arctan x$), the uniform distribution in a finite interval, and the improper distribution $\varepsilon(x)$ are all unimodal in the sense just described.

Without loss of generality, we shall suppose in the following theorems that the vertex of the distribution (the point $a$) is located at the point $a = 0$. It follows from the definition cited above that a unimodal distribution function has right and left derivatives at every point except possibly the vertex. Moreover, $F'(x)$ [under $F'(x)$ we mean either one of the derivatives, right or left, possibly different ones at different points] does not decrease for $x < a$ and does not increase for $x > a$. This fact is well known in the theory of convex functions (see Hardy [48], p. 91).


THEOREM 1.† In order that the distribution function $F(x)$ be unimodal (with vertex at $x = 0$), it is necessary and sufficient that the function

\[ V(x) = F(x) - xF'(x) \]

be a distribution function.§


Proof. First we make a few elementary remarks. 


* Translator's note. § 32 has been shortened; see Appendix II.

† A. Ya. Khintchine [60].

§ Translator's note. See Appendix II for a discussion of the theorem. Part of the result was apparently rediscovered by N. L. Johnson and C. A. Rogers, The moment problem for unimodal distributions, Annals of Mathematical Statistics, 22, 433-439 (1951).

1°. If $F(x)$ is a unimodal distribution function, then for $x \to +0$, $x \to -0$, $x \to +\infty$, and $x \to -\infty$,

\[ xF'(x) \to 0. \tag{1} \]

Indeed, for $x > 0$ [$F'(x)$ does not increase!]

\[ xF'(x) \le 2 \int_{x/2}^{x} F'(u)\, du. \]

Hence

\[ 0 \le xF'(x) \le 2\left(1 - F\!\left(\frac{x}{2}\right)\right) \to 0 \]

as $x \to +\infty$ and

\[ 0 \le xF'(x) \le 2\left(F(x) - F(+0)\right) \to 0 \]

as $x \to +0$. In exactly the same way (1) is proved for $x \to -0$ and $x \to -\infty$.
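2°. If $V(x)$ is a distribution function, then as $x \to +0$ or $x \to +\infty$

\[ x \int_{x}^{\infty} \frac{dV(u)}{u} \to 0, \tag{2a} \]

and as $x \to -0$ or $x \to -\infty$

\[ x \int_{-\infty}^{x} \frac{dV(u)}{u} \to 0. \tag{2b} \]

In fact, as $x \to +0$

\[
x \int_{x}^{\infty} \frac{dV(u)}{u} = x \int_{x}^{\sqrt{x}} \frac{dV(u)}{u} + x \int_{\sqrt{x}}^{\infty} \frac{dV(u)}{u}
\le \int_{x}^{\sqrt{x}} dV(u) + \sqrt{x} \int_{\sqrt{x}}^{\infty} dV(u) \le V(\sqrt{x}) - V(x) + \sqrt{x} \to 0.
\]

As $x \to +\infty$

\[ x \int_{x}^{\infty} \frac{dV(u)}{u} \le \int_{x}^{\infty} dV(u) = 1 - V(x) \to 0. \]

In exactly the same way (2b) is proved.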
Now let $F(x)$ be a unimodal distribution function. We shall prove that

\[ V(x) = F(x) - xF'(x) \tag{3} \]

is a distribution function. Indeed, for $h > 0$ we have

\[
\begin{aligned}
V(x+h) - V(x) &= [F(x+h) - F(x) - hF'(x+h)] + x\,[F'(x) - F'(x+h)] \\
&= [F(x+h) - F(x) - hF'(x)] + (x+h)\,[F'(x) - F'(x+h)].
\end{aligned}
\]

For $x > 0$ in the first of these equations, and for $x < x + h < 0$ in the second, both summands on the right side are non-negative. Hence $V(x+h) - V(x) \ge 0$ for every $x \ne 0$. By (1) we have

\[ V(+0) = F(+0), \qquad V(-0) = F(-0), \]
\[ V(+\infty) = F(+\infty) = 1, \qquad V(-\infty) = F(-\infty) = 0. \]

We shall now prove the converse proposition. Namely, we shall prove that if $V(x)$ is an arbitrary distribution function and the distribution function $F(x)$ satisfies equation (3), then $F(x)$ is unimodal. First of all, it is easy to convince ourselves that among the solutions of equation (3) there can be only one distribution function. Indeed, if there were two, $F_1(x)$ and $F_2(x)$, then their difference $W(x)$ would satisfy the equation

\[ W(x) - xW'(x) = 0 \]

and so*

\[ F_1(x) - F_2(x) = cx \]

for $x \ne 0$, where $c$ is a constant. From the conditions $F_1(-\infty) = F_2(-\infty) = 0$ and the conditions $F_1(+\infty) = F_2(+\infty) = 1$ we conclude that $c = 0$. It is not difficult to convince ourselves by some elementary calculations that the function $F(x)$ defined by means of the equations

\[ F(x) = -\int_{-\infty}^{x} du \int_{-\infty}^{u} \frac{dV(z)}{z} \quad \text{for } x < 0, \]
\[ F(x) = 1 - \int_{x}^{\infty} du \int_{u}^{\infty} \frac{dV(z)}{z} \quad \text{for } x > 0 \tag{4} \]

satisfies (3).† Moreover, it obviously does not decrease for $x < 0$ and $x > 0$; furthermore, by (2a)

\[
F(+0) = 1 - \int_{+0}^{\infty} du \int_{u}^{\infty} \frac{dV(z)}{z}
= 1 - \left[ u \int_{u}^{\infty} \frac{dV(z)}{z} \right]_{+0}^{\infty} - \int_{+0}^{\infty} dV(u) = V(+0),
\]

and in exactly the same way

\[ F(-0) = V(-0). \]

In other words, $F(x)$ is a distribution function. By differentiating (4) we find that $F'(x)$ does not decrease for $x < 0$ and does not increase for $x > 0$. This proves that the function $F(x)$ is unimodal.
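Theorem 1 is easy to test numerically on a familiar unimodal law. The sketch below is an added illustration, not part of the original text; it assumes the standard normal distribution (vertex at zero) and uses hypothetical function names. For this law $V'(x) = -xF''(x) = x^2 F'(x) \ge 0$, so $V(x)$ should be nondecreasing with limits 0 and 1.

```python
import math


def normal_cdf(x):
    """Distribution function F(x) of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Derivative F'(x) of the standard normal distribution function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def v(x):
    """V(x) = F(x) - x F'(x) for the standard normal law."""
    return normal_cdf(x) - x * normal_pdf(x)


def is_nondecreasing(values, tol=1e-12):
    """True if the sequence never drops by more than the rounding tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))
```

Evaluating `v` on a grid over $[-8, 8]$ gives a nondecreasing sequence running from approximately 0 to approximately 1, in agreement with the theorem.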


* Translator's note. See Appendix II for a reference to the required theorem.
† Translator's note. In verifying (3) the denumerable set of points of discontinuity of $V(x)$ may be ignored; see Appendix II.

The theorem just proved can be formulated in terms of characteristic functions as follows:


THEOREM 2.* The function $f(t)$ is the characteristic function of a unimodal distribution function if and only if it can be represented in the form

\[ f(t) = \frac{1}{t} \int_{0}^{t} v(u)\, du, \]

where $v(u)$ is some characteristic function.


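Proof. We have by (4)

\[ f(t) = \int e^{itx}\, dF(x) = \int_{-\infty}^{-0} e^{itx}\, dF(x) + [F(+0) - F(-0)] + \int_{+0}^{\infty} e^{itx}\, dF(x), \]

where

\[
\int_{+0}^{\infty} e^{itx}\, dF(x) = \int_{+0}^{\infty} e^{itx}\, dx \int_{x}^{\infty} \frac{dV(z)}{z}
= \int_{+0}^{\infty} \frac{1}{z} \int_{0}^{z} e^{itx}\, dx\, dV(z)
= \int_{+0}^{\infty} \frac{1}{t} \int_{0}^{t} e^{iuz}\, du\, dV(z),
\]

the integral over $x < 0$ is transformed in the same way, and

\[ F(+0) - F(-0) = \left( \frac{1}{t} \int_{0}^{t} du \right) \left( V(+0) - V(-0) \right). \]

Thus

\[ f(t) = \frac{1}{t} \int \int_{0}^{t} e^{iux}\, du\, dV(x) = \frac{1}{t} \int_{0}^{t} v(u)\, du. \]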
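A concrete instance of Theorem 2, given as an added sketch under stated assumptions and not as part of the original text: for the uniform law on $[0, 1]$ (unimodal with vertex at zero) the function $V(x) = F(x) - xF'(x)$ is the unit mass at $x = 1$, so that $v(u) = e^{iu}$. The code compares a midpoint-rule approximation of $(1/t)\int_0^t v(u)\,du$ with the known characteristic function $(e^{it} - 1)/(it)$; the function names and the step count are illustrative choices.

```python
import cmath


def v(u):
    """Characteristic function of the unit mass at x = 1 (assumed example)."""
    return cmath.exp(1j * u)


def khintchine_average(t, steps=20000):
    """Midpoint-rule approximation of (1/t) * integral from 0 to t of v(u) du, for t != 0."""
    h = t / steps
    total = sum(v((j + 0.5) * h) for j in range(steps))
    return total * h / t


def uniform_cf(t):
    """Characteristic function of the uniform law on [0, 1], for t != 0."""
    return (cmath.exp(1j * t) - 1.0) / (1j * t)
```

The two functions agree to within the quadrature error for positive and negative $t$ alike, which is what the representation of the theorem asserts for this example.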


THEOREM 3. The composition of unimodal distribution functions is unimodal.†


THEOREM 4.‡ If a sequence of unimodal distribution functions converges to a distribution function, then the limit function is also unimodal.


Proof. By hypothesis, as $n \to \infty$

\[ F_n(x) \to F(x). \]

Let $a_n$ be the vertex of $F_n(x)$ and $a = \lim_{n \to \infty} a_n$. Pick a subsequence of the indices $n_k$ so that $\lim_{k \to \infty} a_{n_k} = a$. Now take any two continuity points of $F(x)$, $x_1 < a$ and $x_2 < a$, and determine $N$ so that for all $k > N$ the inequalities $x_1 < a_{n_k}$ and $x_2 < a_{n_k}$ hold. By the hypothesis of unimodality of the functions $F_{n_k}(x)$, we have for $k > N$

\[ F_{n_k}(x_1) + F_{n_k}(x_2) \ge 2F_{n_k}\!\left(\frac{x_1 + x_2}{2}\right). \]

This relation becomes, in the limit,

\[ F(x_1) + F(x_2) \ge 2F\!\left(\frac{x_1 + x_2}{2}\right). \]

We have supposed $x_1$ and $x_2$ to be continuity points of $F(x)$; however, it is obvious that we can easily drop this restriction by making use of the continuity of $F(x)$ from the left.

In exactly the same way, for any $x_3 > a$ and $x_4 > a$ we find that

\[ F(x_3) + F(x_4) \le 2F\!\left(\frac{x_3 + x_4}{2}\right). \]

Thus, $F(x)$ is concave for $x > a$ and convex for $x < a$; namely, $F(x)$ is unimodal.

Finally we remark that $a \ne \pm\infty$, since otherwise $F(x)$ would be convex or concave for all values of the argument. For functions of bounded variation this is possible only if $F(x)$ is a constant.

* A. Ya. Khintchine [60].

† Translator's note. For a discussion of this incorrect statement, see Appendix II.

‡ A. I. Lapin [71].


Part III IDENTICALLY DISTRIBUTED 
SUMMANDS 


CHAPTER 7 


§ 33. STATEMENT OF THE PROBLEM, STABLE LAWS


We shall now turn to the detailed study of limit distributions of normalized sums

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n \tag{1} \]

of independent, identically distributed* random variables $\xi_1, \xi_2, \dots, \xi_n, \dots$


By Theorem 4 of § 14, if the distributions of the sums (1) converge to a limit, then the variables $\xi_{nk} = (\xi_k / B_n) - (A_n / n)$ must necessarily be infinitesimal. This circumstance permits us to conclude that every distribution which is a limit distribution of the sums (1) must belong to the class L. The problem naturally arises of determining all possible limit distributions of the sums (1). The solution of this problem requires the introduction of a new concept.


DEFINITION. The distribution function $F(x)$ is called stable if to every $a_1 > 0$, $b_1$, $a_2 > 0$, $b_2$ there correspond constants $a > 0$ and $b$ such that the equation

\[ F(a_1 x + b_1) * F(a_2 x + b_2) = F(ax + b) \tag{2} \]

holds.


It is easy to verify that the normal and the improper laws are stable. 

Obviously, within one type either all laws are stable or none is stable. 
Therefore it is possible to speak of stable types of laws.†
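The stability of the normal law can be made explicit. The sketch below is an added illustration, not part of the original text; it assumes that $F$ is the standard normal law, so that $F(ax + b)$ is the distribution function of $(\xi - b)/a$ with $\xi$ standard normal, and it uses hypothetical function names. Under this assumption equation (2) holds with $1/a^2 = 1/a_1^2 + 1/a_2^2$ and $b/a = b_1/a_1 + b_2/a_2$, which the code verifies through characteristic functions.

```python
import cmath
import math


def stable_constants(a1, b1, a2, b2):
    """Constants (a, b) in equation (2) when F is the standard normal law (assumed)."""
    a = 1.0 / math.sqrt(1.0 / a1 ** 2 + 1.0 / a2 ** 2)
    b = a * (b1 / a1 + b2 / a2)
    return a, b


def shifted_cf(t, a, b):
    """Characteristic function of the law F(a x + b), i.e. of (xi - b) / a, xi standard normal."""
    return cmath.exp(-1j * t * b / a - 0.5 * (t / a) ** 2)


def composition_cf(t, a1, b1, a2, b2):
    """Characteristic function of the composition F(a1 x + b1) * F(a2 x + b2)."""
    return shifted_cf(t, a1, b1) * shifted_cf(t, a2, b2)
```

The product of the two characteristic functions coincides with that of $F(ax + b)$ for the constants returned by `stable_constants`, so the composition stays within the normal type.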

The importance of the class of stable laws for our problem is determined 
by the following theorem (see [78] and [59]). 


THEOREM. In order that the distribution function F(x) be a limit distribu- 
tion for sums (1) of independent and identically distributed summands, 
it is necessary and sufficient that it be stable. 


Proof. We suppose first that the distribution functions of the sums $\zeta_n$ converge to a certain distribution function $F(x)$.


* I.e., such that for every $x$
\[ P\{\xi_1 < x\} = P\{\xi_2 < x\} = \dots = P\{\xi_n < x\} = \dots = F(x). \]

† The definition of a stable type can be formulated more briefly as follows: a type is stable if it contains all the compositions of the laws belonging to it.

According to the Lemma of § 29, if the function $F(x)$ is proper, then as $n \to \infty$

1) $B_n \to \infty$,
2) $B_{n+1}/B_n \to 1$.

Now let $a_1$ and $a_2$ be any two constants $(0 < a_1 < a_2 < \infty)$; then for every $\varepsilon > 0$ and $n > n_0(\varepsilon)$ it is possible to pick an index $m = m(n)$ so that

\[ \left| \frac{B_m}{B_n} - \frac{a_2}{a_1} \right| < \varepsilon. \]

Further, let $b_1$ and $b_2$ be arbitrary real constants. Consider the sum

\[
\frac{B_n}{B_n^*} \left( \frac{\xi_1 + \dots + \xi_n}{B_n} - A_n - b_1 \right) + \frac{B_m}{B_n^*} \left( \frac{\xi_{n+1} + \dots + \xi_{n+m}}{B_m} - A_m - b_2 \right)
= \frac{\xi_1 + \dots + \xi_{n+m}}{B_n^*} - A_n^*, \tag{3}
\]

where

\[ B_n^* = \frac{B_n}{a_1}, \qquad A_n^* = \frac{B_n A_n + B_m A_m + b_1 B_n + b_2 B_m}{B_n^*}. \]

Since by hypothesis the distribution functions of the sums (1) converge to $F(x)$ as $n \to \infty$, by Theorem 2 of § 10 the distribution functions of the first and second summands on the left side of (3) converge respectively to $F(a_1^{-1} x + b_1)$ and $F(a_2^{-1} x + b_2)$. It follows that the distribution functions of the sums on the right side of (3) must converge to a limit distribution function. On the one hand, this limit must be

\[ F(a_1^{-1} x + b_1) * F(a_2^{-1} x + b_2), \]

on the other hand, it must be of the same type as $F(x)$.*

In other words, we have proved that if $F(x)$ is a proper law, then it satisfies equation (2) and so is stable.

Since all improper laws are stable, the theorem is proved in one direction. The converse proposition, that every stable law $F(x)$ is the limit of distribution functions of the sums (1) of identically distributed summands, is readily deduced from the definition of a stable law. Indeed, let the variables $\xi_k$ $(k = 1, 2, \dots)$ be independent and distributed according to the law $F(x)$; then the sum $\xi_1 + \xi_2 + \dots + \xi_n$ is distributed according to the law $F(a_n x + b_n)$ for some constants $a_n > 0$ and $b_n$, and so the variable

\[ a_n (\xi_1 + \xi_2 + \dots + \xi_n) + b_n \]

is distributed according to the law $F(x)$. Q.E.D.

Among the great number of results in this section we shall especially emphasize the following.

The theorem in § 34 gives an explicit form for the characteristic function of a stable distribution. Among infinitely divisible laws, and even among laws of the class L, the stable laws occupy a modest place in extent: the set of stable types depends on two real parameters $\alpha$ $(0 < \alpha \le 2)$ and $\beta$ $(-1 \le \beta \le 1)$, while the set of infinitely divisible types depends on the choice of a monotone function. We remark that while the theorem just mentioned solves completely the problem of determining all stable characteristic functions, explicit expressions for the stable distribution functions are known only in a small number of cases.

* Translator's note. By Theorem 1 of § 10.

In § 35 is given a complete solution of the fundamental problem in the theory of limit distributions for identically distributed summands: the necessary and sufficient conditions are indicated which must be imposed on the function $F(x)$ in order that the distributions of the sums (1) converge to a limit.

Finally, in § 37 is proved a theorem of A. Ya. Khintchine which states that if we consider the convergence of the distribution functions for the sums (1) as $n$ becomes infinite, not through all possible values but only through some subsequence of them, then the class of possible limit distributions is thereby essentially widened and turns out to coincide with the class of infinitely divisible laws.


§ 34. CANONICAL REPRESENTATION OF STABLE LAWS


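THEOREM.* In order that the distribution function $F(x)$ be stable, it is necessary and sufficient that the logarithm of its characteristic function be represented by the formula

\[ \log f(t) = i\gamma t - c|t|^{\alpha} \left\{ 1 + i\beta \frac{t}{|t|}\, \omega(t, \alpha) \right\}, \tag{1} \]

where $\alpha$, $\beta$, $\gamma$, $c$ are constants ($\gamma$ is any real number, $-1 \le \beta \le 1$, $0 < \alpha \le 2$, $c \ge 0$) and

\[ \omega(t, \alpha) = \begin{cases} \tan \dfrac{\pi}{2}\alpha, & \text{if } \alpha \ne 1, \\[1ex] \dfrac{2}{\pi} \log |t|, & \text{if } \alpha = 1. \end{cases} \]

The functions $M(u)$ and $N(u)$ and the constant $\sigma$ in P. Lévy's formula are, correspondingly (for $0 < \alpha < 2$):

\[ M(u) = \frac{c_1}{|u|^{\alpha}}, \qquad N(u) = -\frac{c_2}{u^{\alpha}}, \qquad \sigma = 0, \]

\[ c_1 \ge 0, \quad c_2 \ge 0, \quad c_1 + c_2 > 0. \]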
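Formula (1) translates directly into code. The following is a minimal sketch added for illustration, not part of the original text; the function name `log_stable_cf` and its argument order are hypothetical, and the value at $t = 0$ is set to zero by convention.

```python
import math


def log_stable_cf(t, alpha, beta, gamma, c):
    """Logarithm of the stable characteristic function according to formula (1)."""
    if t == 0:
        return 0j
    sign = 1.0 if t > 0 else -1.0
    if alpha == 1.0:
        omega = (2.0 / math.pi) * math.log(abs(t))
    else:
        omega = math.tan(math.pi * alpha / 2.0)
    return 1j * gamma * t - c * abs(t) ** alpha * (1.0 + 1j * beta * sign * omega)
```

Two familiar special cases serve as a check: $\alpha = 2$ gives $-ct^2$ (a normal law, since $\tan \pi = 0$ up to rounding), and $\alpha = 1$, $\beta = 0$ gives $-c|t|$ (the Cauchy law). For $\gamma = 0$ and $\alpha \ne 1$ the formula also satisfies $2\log f(t) = \log f(2^{1/\alpha} t)$, which is the stability relation for two identical summands.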





* A. Ya. Khintchine and P. Lévy [62].

Proof. In terms of characteristic functions, equation (2) of § 33 can be rewritten as

\[ \log f\!\left(\frac{t}{a}\right) = \log f\!\left(\frac{t}{a_1}\right) + \log f\!\left(\frac{t}{a_2}\right) + i\beta t, \tag{2} \]

where $\beta = b/a - b_1/a_1 - b_2/a_2$. We recall that $F(x)$ is an infinitely divisible law and, consequently,

\[ \log f(t) = i\gamma t - \frac{\sigma^2}{2} t^2 + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dM(u) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dN(u). \]
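Elementary calculations show that

\[
\log f\!\left(\frac{t}{a}\right) = i\gamma_a t - \frac{\sigma^2}{2a^2} t^2 + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dM(au) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dN(au),
\]

where $\gamma_a$ is a real constant, so that by (2)

\[
\begin{aligned}
& i\gamma_a t - \frac{\sigma^2}{2a^2} t^2 + \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dM(au) + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) dN(au) \\
&\quad = i(\gamma_{a_1} + \gamma_{a_2} + \beta) t - \frac{\sigma^2}{2} \left( \frac{1}{a_1^2} + \frac{1}{a_2^2} \right) t^2
+ \int_{-\infty}^{0} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) d\,[M(a_1 u) + M(a_2 u)] \\
&\qquad + \int_{0}^{\infty} \left( e^{itu} - 1 - \frac{itu}{1+u^2} \right) d\,[N(a_1 u) + N(a_2 u)].
\end{aligned}
\]

Hence, because of the uniqueness of the representation of an infinitely divisible law by Lévy's formula, we conclude that

\[ \sigma^2 \left( \frac{1}{a^2} - \frac{1}{a_1^2} - \frac{1}{a_2^2} \right) = 0, \tag{3} \]

\[ M(au) = M(a_1 u) + M(a_2 u) \quad (u < 0), \tag{4} \]

\[ N(au) = N(a_1 u) + N(a_2 u) \quad (u > 0). \tag{5} \]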
Now suppose that $N(u) \not\equiv 0$; then in (5) $a$ cannot vanish. Otherwise, indeed, for every $u$ the equation

\[ N(a_1 u) + N(a_2 u) = N(+0) \]

would hold, which is possible only if either $N(u) \equiv 0$ or $N(+0) = -\infty$. The second case is impossible for the function $N(u)$.

For the equations obtained above we are going to determine all the continuous nondecreasing solutions $M(u)$ and $N(u)$ satisfying the conditions $M(-\infty) = N(+\infty) = 0$ and not identically equal to zero. For this purpose we remark that it is sufficient to confine ourselves to determining either one of these two functions, say $N(u)$. From (5) we conclude by induction that whatever the natural number $n$ and the positive numbers $a_1, a_2, \dots, a_n$ may be, the following equation holds:

\[ N(au) = N(a_1 u) + N(a_2 u) + \dots + N(a_n u) \quad (u > 0), \]

where $a$ is some positive number, depending on $a_1, a_2, \dots, a_n$. Hence, in particular, for $a_1 = a_2 = \dots = a_n = 1$, we find that

\[ N(au) = nN(u), \]

where $a = a(n)$. From the last equation, putting $a(1/n) = 1/a(n)$, we find that

\[ N\!\left( a\!\left(\frac{1}{n}\right) u \right) = \frac{1}{n} N(u). \]


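Finally, for an arbitrary rational number $p/q$ we find that

\[ \frac{p}{q} N(u) = pN\!\left( a\!\left(\frac{1}{q}\right) u \right) = N\!\left( a\!\left(\frac{1}{q}\right) a(p)\, u \right) = N\!\left( a\!\left(\frac{p}{q}\right) u \right). \tag{6} \]

It is easy to see that the function $a(p/q)$ defined on the set of rational numbers is decreasing. Hence, for every $\lambda > 0$, the limits

\[ a(\lambda - 0) = \lim_{p/q \to \lambda - 0} a\!\left(\frac{p}{q}\right), \qquad a(\lambda + 0) = \lim_{p/q \to \lambda + 0} a\!\left(\frac{p}{q}\right) \]

exist. From (6) it is not difficult to deduce that for every $\lambda > 0$, $a(\lambda - 0) = a(\lambda + 0) = a(\lambda)$. Thus, for every positive $\lambda$, the function $N(u)$ satisfies the equation

\[ \lambda N(u) = N(a(\lambda) u), \tag{7} \]

where $a(\lambda) > 0$ is a decreasing continuous function of $\lambda$.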

Equation (7) permits us to obtain the following important result: 
the function N (u) is either different from zero everywhere or identically equal 
to zero. 

Indeed, suppose the contrary, that for some $u_0 > 0$ the equation $N(u_0) = 0$ holds and that $N(u_1) \ne 0$ for some $u_1 > 0$; we shall reach a contradiction. First of all, from the fact that $N(u)$ is nondecreasing and $N(+\infty) = 0$, we conclude that $N(u) = 0$ for $u > u_0$ and consequently $u_1 < u_0$. Let $u_2$ $(u_1 < u_2 \le u_0)$ be the supremum of those $u$ for which $N(u) \ne 0$. We know that the function $N(u)$ is continuous, hence $N(u_2) = 0$. According to (7),

\[ \lambda N(u_2) = N(a(\lambda) u_2). \]

Let $\lambda > 1$. Since $a(1) = 1$, we must have $0 < a(\lambda) < 1$. But it follows from (7) that

\[ 0 = \lambda N(u_2) = N(a u_2). \]

Since $a u_2 < u_2$, we must have $N(a u_2) < 0$. We have reached a contradiction.
We shall now determine the form of the function $N(u)$, supposing that $N(u) \not\equiv 0$.

By Theorem 1 of § 30 the function $N(u)$ has derivatives (both right and left ones) for every $u$; hence we find from (7) that

\[ \lambda \frac{dN(u)}{du} = a \frac{dN(au)}{d(au)}. \]

Consequently,

\[ \frac{N'(u)}{N(u)} = \frac{a\, \dfrac{dN(au)}{d(au)}}{N(au)}. \]

We put $u = 1$ in this equation and write $N'(1)/N(1) = -\alpha$. As a result, we arrive at the following equation for the function $N(u)$:*

\[ \frac{dN(a)}{N(a)} = -\alpha \frac{da}{a}. \]

Hence

\[ N(a) = -c_2 a^{-\alpha}, \tag{8} \]

where $c_2$ is a constant $(c_2 > 0)$.
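The solution (8) can be checked against the functional equation (7). The sketch below is an added illustration, not part of the original text; it assumes the explicit scaling function $a(\lambda) = \lambda^{-1/\alpha}$, which is what (7) and (8) together imply, and the function names and default parameter values are hypothetical.

```python
def levy_n(u, c2=1.0, alpha=1.5):
    """The function N(u) = -c2 * u**(-alpha) of formula (8), for u > 0."""
    return -c2 * u ** (-alpha)


def scale(lam, alpha=1.5):
    """Assumed scaling function a(lambda) = lambda**(-1/alpha) solving equation (7)."""
    return lam ** (-1.0 / alpha)


def residual(lam, u, c2=1.0, alpha=1.5):
    """Difference between the two sides of (7): lambda * N(u) - N(a(lambda) * u)."""
    return lam * levy_n(u, c2, alpha) - levy_n(scale(lam, alpha) * u, c2, alpha)
```

The residual vanishes up to rounding error for every $\lambda > 0$ and $u > 0$, and `scale` is decreasing with `scale(1.0) == 1.0`, in agreement with the properties of $a(\lambda)$ established in the text.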
Since N (-- оо) = 0, we conclude first that a must be a positive number. 
Furthermore, the integral 


1 1 
f Panu) = са f шаи 
ò 0 


must converge, so0 < а < 2. 
In exactly the same way, we find that 


M (u) =<, (9) 


[а |“: 





where c > 0,0 <a < 2. 
The equations (8) and (9) together with (4) and (5) yield, for $a_1 = a_2 = 1$,

$$\frac{1}{a^{\alpha_1}} = \frac{1}{a^{\alpha_2}} = 2. \tag{10}$$

Hence we conclude that $\alpha_1 = \alpha_2$; we denote the common value by $\alpha$.
From (3), with $a_1 = a_2 = 1$, we find that

$$\sigma^2\left(\frac{1}{a^2} - 2\right) = 0.$$

If the function $M(u)$ [or $N(u)$] does not vanish, then by (10), $1/a^2 \neq 2$
and so we must have $\sigma = 0$. If, however, $\sigma \neq 0$, then $1/a^2 = 2$, and so we
must have $c_1 = c_2 = 0$.

* Here it is necessary to note the following fact: from (7) and the fact that
$N(u) \not\equiv 0$ it is easy to deduce that when $\lambda$ varies in the interval $(0, \infty)$, $a(\lambda)$ takes
all values between zero and infinity.

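Collecting the results obtained, we see that the logarithm of the
characteristic function of a stable law is either

$$\log f(t) = i\gamma t - \frac{\sigma^2}{2}t^2 \quad (\sigma \ge 0), \tag{11}$$

which yields the normal law, or

$$\log f(t) = i\gamma t + c_1\int_{-\infty}^{0}\left(e^{itu} - 1 - \frac{itu}{1+u^2}\right)\frac{du}{|u|^{1+\alpha}} + c_2\int_{0}^{\infty}\left(e^{itu} - 1 - \frac{itu}{1+u^2}\right)\frac{du}{u^{1+\alpha}}, \tag{12}$$

where $c_1 \ge 0$, $c_2 \ge 0$, $0 < \alpha < 2$.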

Elementary verification shows that the formulas (11) and (12) indeed 
give stable laws. Thus the problem concerning the canonical representation 
of stable laws is completely solved. The integrals in (12) can be expressed 
by elementary functions; we now proceed to the derivations of these 
expressions. We have to consider three cases. 

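1. $0 < \alpha < 1$. Since in this case the integrals

$$\int_{-\infty}^{0}\frac{u}{1+u^2}\,\frac{du}{|u|^{1+\alpha}} \quad\text{and}\quad \int_{0}^{\infty}\frac{u}{1+u^2}\,\frac{du}{u^{1+\alpha}}$$

are finite, (12) can be written as

$$\log f(t) = i\gamma' t + c_1\int_{-\infty}^{0}(e^{itu} - 1)\frac{du}{|u|^{1+\alpha}} + c_2\int_{0}^{\infty}(e^{itu} - 1)\frac{du}{u^{1+\alpha}}.$$

We suppose first that $t > 0$; then

$$\log f(t) = i\gamma' t + t^{\alpha}\left[c_1\int_0^{\infty}(e^{-iv} - 1)\frac{dv}{v^{1+\alpha}} + c_2\int_0^{\infty}(e^{iv} - 1)\frac{dv}{v^{1+\alpha}}\right]. \tag{13}$$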


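We make use of Cauchy's theorem on contour integration, taking as the
contour of integration the segment of the real axis from 0 to $R$, the circular
arc of radius $R$ with center at the origin, and the segment of the imaginary
axis from 0 to $iR$. Letting $R$ tend to infinity, we arrive at the equation

$$\int_0^{\infty}(e^{iv} - 1)\frac{dv}{v^{1+\alpha}} = e^{-\frac{\pi}{2}i\alpha}\int_0^{\infty}(e^{-y} - 1)\frac{dy}{y^{1+\alpha}} = e^{-\frac{\pi}{2}i\alpha}L(\alpha),$$

where

$$L(\alpha) = \int_0^{\infty}(e^{-y} - 1)\frac{dy}{y^{1+\alpha}} < 0.$$

Since the first integral in (13) is the complex conjugate of the second
integral, it follows that

$$\int_0^{\infty}(e^{-iv} - 1)\frac{dv}{v^{1+\alpha}} = e^{\frac{\pi}{2}i\alpha}L(\alpha).$$

Thus, for $t > 0$,

$$\log f(t) = i\gamma' t + t^{\alpha}L(\alpha)\left[(c_1 + c_2)\cos\frac{\pi}{2}\alpha + i(c_1 - c_2)\sin\frac{\pi}{2}\alpha\right].$$

Noting that $\cos(\pi/2)\alpha > 0$ and putting

$$c = -L(\alpha)(c_1 + c_2)\cos\frac{\pi}{2}\alpha \quad (c \ge 0), \qquad \beta = \frac{c_1 - c_2}{c_1 + c_2} \quad (-1 \le \beta \le 1),$$

we find that

$$\log f(t) = i\gamma' t - ct^{\alpha}\left\{1 + i\beta\tan\frac{\pi}{2}\alpha\right\}.$$

For $t < 0$,

$$\log f(t) = \overline{\log f(-t)} = -i\gamma'(-t) - c(-t)^{\alpha}\left\{1 - i\beta\tan\frac{\pi}{2}\alpha\right\} = i\gamma' t - c|t|^{\alpha}\left\{1 - i\beta\tan\frac{\pi}{2}\alpha\right\}.$$

Therefore, for every $t$,

$$\log f(t) = i\gamma' t - c|t|^{\alpha}\left\{1 + i\beta\frac{t}{|t|}\tan\frac{\pi}{2}\alpha\right\}. \tag{14}$$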
2. $1 < \alpha < 2$. In this case, by changing the constant $\gamma$, (12) can be
written as

$$\log f(t) = i\gamma'' t + c_1\int_{-\infty}^{0}(e^{itu} - 1 - itu)\frac{du}{|u|^{1+\alpha}} + c_2\int_{0}^{\infty}(e^{itu} - 1 - itu)\frac{du}{u^{1+\alpha}}.$$

For $t > 0$,

$$\log f(t) = i\gamma'' t + t^{\alpha}\left[c_1\int_0^{\infty}(e^{-iv} - 1 + iv)\frac{dv}{v^{1+\alpha}} + c_2\int_0^{\infty}(e^{iv} - 1 - iv)\frac{dv}{v^{1+\alpha}}\right].$$

Applying Cauchy's theorem to the integrals on the right side and taking
the same contour of integration as in the first case, we obtain

$$\int_0^{\infty}(e^{iv} - 1 - iv)\frac{dv}{v^{1+\alpha}} = e^{-\frac{\pi}{2}i\alpha}M(\alpha),$$

where

$$M(\alpha) = \int_0^{\infty}(e^{-v} - 1 + v)\frac{dv}{v^{1+\alpha}}.$$

In exactly the same way,

$$\int_0^{\infty}(e^{-iv} - 1 + iv)\frac{dv}{v^{1+\alpha}} = e^{\frac{\pi}{2}i\alpha}M(\alpha).$$

In this case, putting

$$c = -M(\alpha)(c_1 + c_2)\cos\frac{\pi}{2}\alpha \quad (c \ge 0)$$

and attributing the same meaning to the number $\beta$ as in the first case, we
find that (14) again holds for $\log f(t)$.
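3. $\alpha = 1$. Since

$$\int_0^{\infty}\frac{1 - \cos z}{z^2}\,dz = \frac{\pi}{2},$$

we find that for $t > 0$

$$\int_0^{\infty}\left(e^{itu} - 1 - \frac{itu}{1+u^2}\right)\frac{du}{u^2} = \int_0^{\infty}\frac{\cos tu - 1}{u^2}\,du + i\int_0^{\infty}\left(\sin tu - \frac{tu}{1+u^2}\right)\frac{du}{u^2}$$

$$= -\frac{\pi}{2}t + i\lim_{\varepsilon\to 0}\left[\int_{\varepsilon}^{\infty}\frac{\sin tu}{u^2}\,du - t\int_{\varepsilon}^{\infty}\frac{du}{u(1+u^2)}\right]$$

$$= -\frac{\pi}{2}t + it\lim_{\varepsilon\to 0}\left[\int_{\varepsilon}^{\infty}\left(\frac{\sin v}{v^2} - \frac{1}{v(1+v^2)}\right)dv + \int_{\varepsilon t}^{\varepsilon}\frac{\sin v}{v^2}\,dv\right].$$

But

$$\lim_{\varepsilon\to 0}\int_{\varepsilon t}^{\varepsilon}\frac{\sin v}{v^2}\,dv = \lim_{\varepsilon\to 0}\int_{\varepsilon t}^{\varepsilon}\frac{dv}{v} = -\log t;$$

hence

$$\int_0^{\infty}\left(e^{itu} - 1 - \frac{itu}{1+u^2}\right)\frac{du}{u^2} = -\frac{\pi}{2}t - it\log t + it\Gamma,$$

where

$$\Gamma = \int_0^{\infty}\left(\frac{\sin v}{v^2} - \frac{1}{v(1+v^2)}\right)dv.$$

Since the first integral in (12) is the complex conjugate of the second,
for $t > 0$

$$\log f(t) = i\gamma' t - (c_1 + c_2)\frac{\pi}{2}t - i(c_2 - c_1)\,t\log t.$$

For $t < 0$

$$\log f(t) = \overline{\log f(-t)} = -i\gamma'(-t) - (c_1 + c_2)\frac{\pi}{2}(-t) + i(c_2 - c_1)(-t)\log(-t) = i\gamma' t - (c_1 + c_2)\frac{\pi}{2}|t| - i(c_2 - c_1)\,t\log|t|.$$

Putting

$$c = (c_1 + c_2)\frac{\pi}{2}, \qquad \beta = \frac{c_2 - c_1}{c_1 + c_2},$$

we find that for all $t$

$$\log f(t) = i\gamma' t - c|t|\left\{1 + i\beta\frac{t}{|t|}\,\frac{2}{\pi}\log|t|\right\}. \tag{15}$$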

The proof of the theorem is thereby completed. 
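
The closed forms (14) and (15) are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original text: the function name `log_cf` and its argument names are hypothetical, and the check only illustrates the stability property (the sum of two independent copies is of the same type) for the formulas as written above.

```python
import math

def log_cf(t, alpha, beta, c, gamma=0.0):
    """Logarithm of a stable characteristic function.

    Uses formula (14) when alpha != 1 and formula (15) when alpha == 1.
    The parameter names mirror the symbols of the text; the function
    itself is only an illustration.
    """
    if t == 0:
        return 0j
    sign = 1.0 if t > 0 else -1.0
    if alpha == 1:
        omega = (2.0 / math.pi) * math.log(abs(t))
    else:
        omega = math.tan(math.pi * alpha / 2.0)
    return 1j * gamma * t - c * abs(t) ** alpha * (1 + 1j * beta * sign * omega)

def shift_for_sum_of_two(t, alpha, beta, c):
    """Return b(t) with 2*log f(t) = log f(2**(1/alpha) * t) + i*b(t)*t (gamma = 0).

    For a stable law b(t) must be a real constant; for alpha != 1 it is zero.
    """
    scaled = 2.0 ** (1.0 / alpha) * t
    return (2 * log_cf(t, alpha, beta, c) - log_cf(scaled, alpha, beta, c)) / (1j * t)
```

For $\alpha \neq 1$ the returned shift vanishes; for $\alpha = 1$ it is the real constant $(4c\beta/\pi)\log 2$, independent of $t$, which is exactly the recentering that the definition of stability allows.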

We shall agree to call $\alpha$ in formula (1) the characteristic exponent of the
stable law.

At the present time an explicit form for a stable distribution function
is known only in a few cases. Thus it has long been known that the normal
law ($\alpha = 2$) and the Cauchy law ($\alpha = 1$, $\beta = 0$) are stable. Recently *
N. V. Smirnov proved that the stable law for which in (1) $\alpha = \frac{1}{2}$, $\beta = 1$,
$\gamma = 0$, $c = 1$ has probability density equal to

$$p\left(x; \tfrac{1}{2}, 1, 0, 1\right) = \begin{cases} 0 & \text{for } x < 0, \\ \dfrac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2x}}\,x^{-\frac{3}{2}} & \text{for } x > 0. \end{cases}$$

This law belongs to the system of Pearson curves (type V). 
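
As a small numerical illustration (a sketch, not taken from the original), the density above can be coded directly. The closed form of its distribution function in terms of the complementary error function, $F(x) = \operatorname{erfc}(1/\sqrt{2x})$ for $x > 0$, is a standard fact that is not stated in the text; both function names below are hypothetical.

```python
import math

def smirnov_levy_density(x):
    """Density exp(-1/(2x)) * x**(-3/2) / sqrt(2*pi) for x > 0, zero otherwise."""
    if x <= 0:
        return 0.0
    return math.exp(-1.0 / (2.0 * x)) / (math.sqrt(2.0 * math.pi) * x ** 1.5)

def smirnov_levy_cdf(x):
    """Distribution function erfc(1/sqrt(2x)) for x > 0 (assumed closed form)."""
    if x <= 0:
        return 0.0
    return math.erfc(1.0 / math.sqrt(2.0 * x))
```

Differentiating `smirnov_levy_cdf` numerically reproduces `smirnov_levy_density`, and the tail $1 - F(x)$ decays like $x^{-1/2}$, in agreement with the exponent $\alpha = \frac{1}{2}$.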





* Translator's note. This was also obtained by P. Lévy, Sur certains processus
stochastiques homogènes, Compositio Mathematica, 7, 283-339 (1940); see p. 284
and p. 294.

§ 35. DOMAINS OF ATTRACTION FOR STABLE LAWS

Let the random variables

$$\xi_1, \xi_2, \dots, \xi_n, \dots$$

be independent and have a common distribution function $F(x)$.
If for suitably chosen constants $A_n$ and $B_n$ the distribution functions
of the sums

$$\zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n$$

converge as $n \to \infty$ to a distribution function $V(x)$, then we say that
$F(x)$ is attracted to $V(x)$. The totality of distribution functions attracted
to $V(x)$ is called the domain of attraction of the law $V(x)$. From § 33 it is
clear that all stable laws, and only these, have (non-empty) domains of
attraction.

Obviously, the domains of attraction of two laws belonging to the 
same type coincide. Hence it is possible to speak of the domain of attraction 
of a type. 

One of the fundamental problems in the theory of stable laws should 
be the determination of their domains of attraction. In this section we shall 
give its complete solution. As an example of the forthcoming theorems 
we note once again the fact mentioned before: while the normal law 
attracts a very wide class of distribution laws, the domains of attraction 
of the other stable laws consist only of those distribution laws whose char- 
acter recalls the character of the attracting law. 


THEOREM 1.* The distribution function $F(x)$ belongs to the domain of
attraction of a (proper) normal law if and only if as $X \to \infty$

$$\frac{X^2\int_{|x|>X}dF(x)}{\int_{|x|<X}x^2\,dF(x)} \to 0. \tag{1}$$

Proof. We first remark that if the variance of a proper law $F(x)$ is finite,
then $F(x)$, first, satisfies (1) and, second, belongs to the domain of attraction
of the normal law. Indeed, for such a law, as $X \to \infty$

$$X^2\int_{|x|>X}dF(x) \le \int_{|x|>X}x^2\,dF(x) \to 0$$

and at the same time

$$\int_{|x|<X}x^2\,dF(x) \to \int x^2\,dF(x) > 0.$$

From these two relations (1) follows.
Furthermore, we put

$$a = \int x\,dF(x), \qquad \sigma^2 = \int (x - a)^2\,dF(x), \qquad B_n = \sigma\sqrt{n}.$$

Then for every $\tau > 0$ as $n \to \infty$

$$\frac{n}{B_n^2}\int_{|x-a|>\tau B_n}(x - a)^2\,dF(x) = \frac{1}{\sigma^2}\int_{|x-a|>\tau B_n}(x - a)^2\,dF(x) \to 0.$$

From Theorem 3 of § 21 it follows that $F(x)$ belongs to the domain of
attraction of the normal law.

* A. Ya. Khintchine [52], W. Feller [27], P. Lévy [75].

We can therefore confine ourselves to the consideration of laws $F(x)$
with infinite variances. In other words, we can confine ourselves to the
consideration of laws $F(x)$ for which, as $X \to \infty$,

$$\int_{-X}^{X}x^2\,dF(x) \to \infty. \tag{2}$$

We shall prove first that if (2) holds, then as $X \to \infty$

$$\left(\int_{-X}^{X}x\,dF(x)\right)^2 = o\left(\int_{-X}^{X}x^2\,dF(x)\right). \tag{3}$$

Indeed, let $z(x)$ be a positive function which increases unboundedly as $x \to \pm\infty$
and such that the integral

$$C = \int z^2(x)\,dF(x)$$

is finite. Then by the inequality of Cauchy and Bunyakovskii

$$\left(\int_{-X}^{X}x\,dF(x)\right)^2 = \left(\int_{-X}^{X}z(x)\,\frac{x}{z(x)}\,dF(x)\right)^2 \le \int_{-X}^{X}z^2(x)\,dF(x)\int_{-X}^{X}\frac{x^2}{z^2(x)}\,dF(x) \le C\int_{-X}^{X}\frac{x^2}{z^2(x)}\,dF(x).$$

From this inequality (3) obviously follows.

According to Theorem 4 of § 26 the law $F(x)$ belongs to the domain
of attraction of the normal law if and only if there exists a sequence of
constants $C_n$ ($C_n \to \infty$ as $n \to \infty$) such that as $n \to \infty$

$$n\int_{|x|>C_n}dF(x) \to 0, \tag{4}$$

$$\frac{n}{C_n^2}\left[\int_{|x|<C_n}x^2\,dF(x) - \left(\int_{|x|<C_n}x\,dF(x)\right)^2\right] \to \infty.$$

By what has just been proved for laws with infinite variances the second
condition can be replaced by a simpler one, namely, as $n \to \infty$

$$\frac{n}{C_n^2}\int_{|x|<C_n}x^2\,dF(x) \to \infty. \tag{5}$$

We shall first prove that (4) and (5) imply (1). To this end we note
that since $C_n \to \infty$ as $n \to \infty$, for every sufficiently large $X$ an $n$ can be
found such that

$$C_n \le X < C_{n+1}.$$

For the sake of brevity we introduce the notations

$$\chi(X) = \int_{|x|>X}dF(x), \qquad H(X) = \frac{1}{X^2}\int_{|x|<X}x^2\,dF(x).$$

It is easy to see that

$$H(C_n) + \chi(C_n) \ge H(X) \ge H(C_{n+1}) - \chi(C_n),$$

and also

$$\chi(C_n) \ge \chi(X) \ge \chi(C_{n+1}).$$

From these inequalities we find that

$$\frac{n\chi(C_n)}{nH(C_{n+1}) - n\chi(C_n)} \ge \frac{\chi(X)}{H(X)} \ge \frac{(n+1)\chi(C_{n+1})}{\frac{n+1}{n}\left[nH(C_n) + n\chi(C_n)\right]}.$$

By (4) and (5) the first and last fractions tend to zero as $n \to \infty$ and so
(1) holds.


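We shall now prove that the condition (1) is sufficient, i.e., it is possible
to determine a sequence of constants $C_n$ ($C_n \to \infty$ as $n \to \infty$) for which (4)
and (5) will be satisfied simultaneously. To this end we pick an arbitrary
$\delta > 0$ and denote by $C_n(\delta)$ the infimum of all $X$ for which

$$n\chi(X) \le \delta. \tag{6}$$

Since by hypothesis

$$\int x^2\,dF(x) = +\infty,$$

it is evident that $C_n(\delta) \to \infty$ as $n \to \infty$. By the definition of $C_n(\delta)$ we have
$n\chi\left(\tfrac{1}{2}C_n(\delta)\right) > \delta$; hence by (1), whatever $\varepsilon > 0$ may be, for
$n > n(\delta, \varepsilon)$

$$nH\left(\tfrac{1}{2}C_n(\delta)\right) > \frac{\delta}{\varepsilon}.$$

Since

$$H\left(\tfrac{1}{2}C_n(\delta)\right) = \frac{4}{C_n^2(\delta)}\int_{|x|<\frac{1}{2}C_n(\delta)}x^2\,dF(x) \le \frac{4}{C_n^2(\delta)}\int_{|x|<C_n(\delta)}x^2\,dF(x) = 4H(C_n(\delta)),$$

for $n > n(\delta, \varepsilon)$,

$$nH(C_n(\delta)) > \frac{\delta}{4\varepsilon}.$$

Therefore, for every $\delta > 0$, as $n \to \infty$

$$nH(C_n(\delta)) \to \infty.$$

Consequently, it is possible to pick a sequence $\delta_n$ converging to zero so that

$$nH(C_n(\delta_n)) \to \infty \quad\text{as } n \to \infty.$$

But by (6),

$$n\chi(C_n(\delta_n)) \le \delta_n \to 0 \quad\text{as } n \to \infty.$$

In other words, with $C_n = C_n(\delta_n)$, as $n \to \infty$

$$n\int_{|x|>C_n}dF(x) \to 0$$

and

$$\frac{n}{C_n^2}\int_{|x|<C_n}x^2\,dF(x) \to \infty.$$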

We have proved that (1) implies (4) and (5); the theorem is proved. 
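
To see criterion (1) at work, the ratio can be written in closed form for two simple symmetric laws. The sketch below is illustrative only; the two densities and the function names are assumptions chosen for the example, not examples from the original. For the density $|x|^{-3}$ on $|x| \ge 1$ the variance is infinite, yet the ratio equals $1/(2\log X)$ and tends to zero, so the law is attracted to the normal law. For the density $(\alpha/2)|x|^{-\alpha-1}$ on $|x| \ge 1$ with $0 < \alpha < 2$ the ratio tends to $(2-\alpha)/\alpha \neq 0$.

```python
import math

def ratio_cubic_tails(X):
    """X^2 P{|xi| > X} / int_{|x|<X} x^2 dF for the density |x|**-3, |x| >= 1."""
    tail = X ** -2.0                              # P{|xi| > X}
    truncated_second_moment = 2.0 * math.log(X)   # 2 * int_1^X x**-1 dx
    return X ** 2 * tail / truncated_second_moment

def ratio_pareto_tails(X, alpha):
    """Same ratio for the density (alpha/2)|x|**(-alpha-1), |x| >= 1, 0 < alpha < 2."""
    tail = X ** -alpha
    truncated_second_moment = alpha / (2.0 - alpha) * (X ** (2.0 - alpha) - 1.0)
    return X ** 2 * tail / truncated_second_moment
```

The first ratio decreases to zero only logarithmically, which reflects how slowly such a borderline law approaches the normal law.
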
THEOREM 2.* In order that the distribution function $F(x)$ belong to the
domain of attraction of a stable law with the characteristic exponent
$\alpha$ ($0 < \alpha < 2$) † it is necessary and sufficient that

1) $$\frac{F(-x)}{1 - F(x)} \to \frac{c_1}{c_2} \quad\text{as } x \to \infty, \tag{7}$$

2) for every constant $k > 0$

$$\frac{1 - F(x) + F(-x)}{1 - F(kx) + F(-kx)} \to k^{\alpha} \quad\text{as } x \to \infty. \tag{8}$$

Proof. According to Theorem 4 of § 25, in order that the law $F(x)$ belong
to the domain of attraction of a stable law with exponent $\alpha$ ($0 < \alpha < 2$),
it is necessary and sufficient that for some choice of the constants $B_n$ the
following conditions be satisfied:

$$nF(B_n x) \to \frac{c_1}{|x|^{\alpha}} \quad (x < 0), \tag{9}$$

$$n(1 - F(B_n x)) \to \frac{c_2}{x^{\alpha}} \quad (x > 0), \tag{10}$$

$$\lim_{\varepsilon\to 0}\,\varlimsup_{n\to\infty}\, n\left\{\int_{|x|<\varepsilon}x^2\,dF(B_n x) - \left(\int_{|x|<\varepsilon}x\,dF(B_n x)\right)^2\right\} = 0. \tag{11}$$

* B. V. Gnedenko [38], Doeblin [24].

† Translator's note. It should be noted that this theorem also gives a necessary
and sufficient condition for $F(x)$ to belong to the domain of attraction of any given
stable law determined by the constants $\alpha$, $c_1$, and $c_2$ subject to the necessary
conditions: $0 < \alpha < 2$, $c_1 + c_2 > 0$, $|c_1 - c_2| \le c_1 + c_2$ (see § 34). Also, the proper
choice of the normalizing constants can be deduced from the proof. For $B_n$ this is
stated explicitly. For $A_n$ it can be shown (see Theorem 4 of § 25) that it may be
chosen to be $\frac{n}{B_n}\int_{-\infty}^{\infty}x\,dF(x)$ if $1 < \alpha < 2$; $n\,\mathrm{Im}\log f(1/B_n)$ if $\alpha = 1$, where $f$ is the
characteristic function of $F$; and $= 0$ if $\alpha < 1$. Cf. B. V. Gnedenko and V. S. Korolyuk,
Some remarks on the theory of domains of attraction of stable distributions (in
Russian), Doklady Akad. Nauk Ukrain. SSR, no. 4, 275-278 (1950), where the
conditions 1) and 2) in Theorem 2 are also given in terms of the characteristic function.

The necessity of the conditions of the theorem follows from these relations
without difficulty. In fact, let $y > 0$ be large. Choose $n$ so that for
a given $x > 0$

$$B_n x \le y < B_{n+1}x.$$

Obviously,

$$F(-B_{n+1}x) \le F(-y) \le F(-B_n x),$$
$$1 - F(B_{n+1}x) \le 1 - F(y) \le 1 - F(B_n x),$$

and for $k > 0$

$$F(-B_{n+1}kx) \le F(-ky) \le F(-B_n kx),$$
$$1 - F(B_{n+1}kx) \le 1 - F(ky) \le 1 - F(B_n kx).$$

Hence

$$\frac{F(-B_{n+1}x)}{1 - F(B_n x)} \le \frac{F(-y)}{1 - F(y)} \le \frac{F(-B_n x)}{1 - F(B_{n+1}x)},$$

and also

$$\frac{1 - F(B_{n+1}x) + F(-B_{n+1}x)}{1 - F(B_n kx) + F(-B_n kx)} \le \frac{1 - F(y) + F(-y)}{1 - F(ky) + F(-ky)} \le \frac{1 - F(B_n x) + F(-B_n x)}{1 - F(B_{n+1}kx) + F(-B_{n+1}kx)}.$$

An application of (9) and (10) gives

$$\lim_{y\to\infty}\frac{F(-y)}{1 - F(y)} = \frac{c_1}{c_2}, \qquad \lim_{y\to\infty}\frac{1 - F(y) + F(-y)}{1 - F(ky) + F(-ky)} = k^{\alpha}.$$

Sufficiency. The conditions of the theorem have meaning only if for
every $x$

$$1 - F(x) + F(-x) > 0,$$

i.e.,

$$P\{|\xi| > x\} > 0.$$

Hence it follows that the infimum of all $x$ satisfying the inequality

$$P\{|\xi| > x\} \le \frac{c_1 + c_2}{n}, \tag{12}$$

which we denote by $B_n$, tends to infinity as $n \to \infty$.
From (8) it follows that * for every $x > 0$

$$n(1 - F(xB_n) + F(-xB_n)) = n(1 - F(B_n + 0) + F(-B_n - 0))\,x^{-\alpha}(1 + o(1)),$$
$$n(1 - F(xB_n) + F(-xB_n)) = n(1 - F(B_n - 0) + F(-B_n + 0))\,x^{-\alpha}(1 + o(1)).$$

From this and (12) we conclude that as $n \to \infty$

$$n(1 - F(xB_n) + F(-xB_n)) \to \frac{c_1 + c_2}{x^{\alpha}}. \tag{13}$$

From (7) it follows that for every $x > 0$

$$c_1 n(1 - F(xB_n)) = c_2 nF(-xB_n)(1 + o(1)). \tag{14}$$

From (13) and (14) follow (9) and (10). It remains to derive (11) from
(7) and (8).
For this purpose we first establish that

$$\int x^2\,dF(x) = +\infty. \tag{15}$$
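Let $x_0$ be so large that for a given $\varepsilon > 0$ and $k > 1$, chosen so that
$k^{2-\alpha}(1 - \varepsilon) > 1$ and $k^{\alpha}(1 - \varepsilon) > 1$,

$$\frac{P\{|\xi| > k^{s}x_0\}}{P\{|\xi| > k^{s+1}x_0\}} = k^{\alpha}(1 + \varepsilon_s), \tag{16}$$

where $|\varepsilon_s| \le \varepsilon$ for $s = 0, 1, 2, \dots$.
By the condition 2) of the theorem such an $x_0$ can be found. Obviously,

$$\int x^2\,dF(x) = \int_{|x|<x_0}x^2\,dF(x) + \sum_{s=1}^{\infty}\int_{k^{s-1}x_0\le|x|<k^{s}x_0}x^2\,dF(x) \ge \int_{|x|<x_0}x^2\,dF(x) + x_0^2\sum_{s=1}^{\infty}k^{2(s-1)}P\{k^{s-1}x_0 \le |\xi| < k^{s}x_0\}.$$

By (16)

$$P\{k^{s-1}x_0 \le |\xi| < k^{s}x_0\} = P\{|\xi| > k^{s-1}x_0\} - P\{|\xi| > k^{s}x_0\}$$

$$= P\{|\xi| > x_0\}\prod_{r=0}^{s-2}\frac{1}{k^{\alpha}(1 + \varepsilon_r)}\left[1 - \frac{1}{k^{\alpha}(1 + \varepsilon_{s-1})}\right] \ge P\{|\xi| > x_0\}\,k^{-\alpha(s-1)}(1 - \varepsilon)^{s-1}\left[1 - \frac{1}{k^{\alpha}(1 - \varepsilon)}\right].$$

Thus,

$$\int x^2\,dF(x) \ge \int_{|x|<x_0}x^2\,dF(x) + x_0^2\,P\{|\xi| > x_0\}\left[1 - \frac{1}{k^{\alpha}(1 - \varepsilon)}\right]\sum_{s=1}^{\infty}\left[k^{2-\alpha}(1 - \varepsilon)\right]^{s-1}.$$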
1 
* TTranslator's note. Apply (8) with, e.g., x (n. + 1) for x ан} -for k. 
Since $P\{|\xi| > x_0\} > 0$ for every $x_0$ and the series on the right side of the
inequality diverges, (15) has been proved.


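Let $x_0$ be chosen as before. Choose $n$ so large that

$$\int_{|x|<x_0}x^2\,dF(x) < \int_{x_0\le|x|<B_n\varepsilon}x^2\,dF(x);$$

this is possible by (15); then obviously

$$\int_{|x|<B_n\varepsilon}x^2\,dF(x) < 2\int_{x_0\le|x|<B_n\varepsilon}x^2\,dF(x).$$

Let the integer $s \ge 0$ be such that

$$k^{s}x_0 \le B_n\varepsilon < k^{s+1}x_0. \tag{17}$$

We have

$$\int_{|x|<B_n\varepsilon}x^2\,dF(x) \le 2x_0^2\sum_{r=0}^{s}k^{2(r+1)}P\{k^{r}x_0 \le |\xi| < k^{r+1}x_0\} \le 2x_0^2\sum_{r=0}^{s}k^{2(r+1)}P\{|\xi| > k^{r}x_0\}. \tag{18}$$

By (16) and (17),

$$P\{|\xi| > k^{r}x_0\} \le k^{\alpha}(1 + \varepsilon)P\{|\xi| > k^{r+1}x_0\} \le \left[(1 + \varepsilon)k^{\alpha}\right]^{s-r+1}P\{|\xi| > k^{s+1}x_0\} \le \left[(1 + \varepsilon)k^{\alpha}\right]^{s-r+1}P\{|\xi| > \varepsilon B_n\}.$$

From (18) we find that for $k^{\alpha-2} < (1 + \varepsilon)^{-1}$

$$\int_{|x|<B_n\varepsilon}x^2\,dF(x) \le 2B_n^2\varepsilon^2(1 + \varepsilon)k^{\alpha+2}P\{|\xi| > \varepsilon B_n\}\sum_{r=0}^{s}\left[(1 + \varepsilon)k^{\alpha-2}\right]^{r} \le 2B_n^2\varepsilon^2(1 + \varepsilon)k^{\alpha+2}P\{|\xi| > \varepsilon B_n\}\,\frac{1}{1 - (1 + \varepsilon)k^{\alpha-2}}.$$

Hence by (13),

$$\varlimsup_{n\to\infty}\frac{n}{B_n^2}\int_{|x|<\varepsilon B_n}x^2\,dF(x) \le 2\varepsilon^{2-\alpha}(c_1 + c_2)\,\frac{(1 + \varepsilon)k^{\alpha+2}}{1 - (1 + \varepsilon)k^{\alpha-2}},$$

and therefore

$$\lim_{\varepsilon\to 0}\,\varlimsup_{n\to\infty}\frac{n}{B_n^2}\int_{|x|<\varepsilon B_n}x^2\,dF(x) = 0.$$

Now the inequality

$$n\left\{\int_{|x|<\varepsilon}x^2\,dF(B_n x) - \left(\int_{|x|<\varepsilon}x\,dF(B_n x)\right)^2\right\} \le \frac{n}{B_n^2}\int_{|x|<\varepsilon B_n}x^2\,dF(x)$$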

proves the validity of the asserted theorem. 
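
The sufficiency proof is constructive: $B_n$ is the infimum of all $x$ satisfying (12). A minimal sketch of that recipe follows, assuming a nonincreasing tail function $P\{|\xi| > x\}$ is available as a Python callable. The helper names and the bisection bounds are hypothetical choices made for the illustration. For the symmetric law with $P\{|\xi| > x\} = x^{-\alpha}$ ($x \ge 1$) one has $c_1 = c_2 = \frac{1}{2}$, so (12) gives $B_n = n^{1/\alpha}$.

```python
def pareto_tail(x, alpha):
    """P{|xi| > x} for the symmetric law with tail x**-alpha on x >= 1."""
    return min(1.0, x ** -alpha)

def normalizing_constant(tail, level, lo=1e-9, hi=1e18, iters=200):
    """Approximate inf{x : tail(x) <= level} by bisection on a logarithmic scale.

    `tail` must be nonincreasing; `level` plays the role of (c_1 + c_2)/n in (12).
    The bracket [lo, hi] is an assumption of this sketch.
    """
    for _ in range(iters):
        mid = (lo * hi) ** 0.5        # geometric midpoint
        if tail(mid) <= level:
            hi = mid
        else:
            lo = mid
    return hi
```

With `alpha = 1.5` and `n = 1000` the routine returns a value close to $1000^{2/3} = 100$, and the ratio `pareto_tail(x, alpha) / pareto_tail(k * x, alpha)` equals $k^{\alpha}$, which is condition 2) of Theorem 2.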
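THEOREM 3.* If the distribution function $F(x)$ belongs to the domain of
attraction of a stable law with exponent $\alpha$, then for every $\delta$ ($0 \le \delta < \alpha$)
the integral

$$\int |x|^{\delta}\,dF(x)$$

exists.

Proof. We consider first the case $\alpha < 2$. Then, whatever $\varepsilon > 0$ and
$k > 1$ may be, according to the preceding theorem it is possible to find
an $x_0$ such that

$$\frac{P\{|\xi| > k^{s+1}x_0\}}{P\{|\xi| > k^{s}x_0\}} = k^{-\alpha}(1 + \varepsilon_s), \tag{19}$$

where $|\varepsilon_s| < \varepsilon$ for $s = 0, 1, 2, \dots$. We suppose that $(1 + \varepsilon)k^{\delta-\alpha} < 1$. Obviously,

$$\int |x|^{\delta}\,dF(x) \le{}^{\dagger} \int_{-x_0}^{x_0}|x|^{\delta}\,dF(x) + \sum_{s=1}^{\infty}\int_{k^{s-1}x_0\le|x|<k^{s}x_0}|x|^{\delta}\,dF(x) \le \int_{-x_0}^{x_0}|x|^{\delta}\,dF(x) + x_0^{\delta}\sum_{s=1}^{\infty}k^{\delta s}P\{|\xi| \ge k^{s-1}x_0\}. \tag{20}$$

By (19),

$$P\{|\xi| > k^{s}x_0\} = \prod_{r=0}^{s-1}(1 + \varepsilon_r)\,k^{-\alpha s}P\{|\xi| > x_0\},$$

hence

$$\sum_{s=1}^{\infty}k^{\delta s}P\{|\xi| \ge k^{s-1}x_0\} \le P\{|\xi| > x_0\}\,k^{\delta}\sum_{s=1}^{\infty}(1 + \varepsilon)^{s-1}k^{-(\alpha-\delta)(s-1)},$$

and so the right side of (20) is finite by the choice of $\varepsilon$ and $k$.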
For the proof of the theorem in the case a = 2 we consider the function 


Ņ (2) = f 2de f ар(ху+-2 [оао [ dF (x). 


{ш|< Ja[ >z |т|>® 


* For the case $\alpha = 2$ see, e.g., A. Ya. Khintchine [52], H. Cramér [21]. For the
case $\alpha < 2$ see B. V. Gnedenko [44].

† Translator's note. In the original an equality sign stands here. However, according
to the convention used in this book (Chapter 1, § 6) $\int_{-x_0}^{x_0}|x|^{\delta}\,dF(x)$ stands for
$\int_{-x_0\le x<x_0}|x|^{\delta}\,dF(x)$. Hence an inequality sign is in order. We have not undertaken
to check all similar formulas.

Obviously it is sufficient to confine our consideration to the case that
the variance of $F(x)$ is infinite.

In this case the nondecreasing function $\psi(z)$ becomes infinite as $z \to \infty$.
By Theorem 1,





oci foe | ЕЕ) 
[аї> е 


Let $M(z)$ denote the supremum of $v^{-\varepsilon}\psi(v)$ in the interval $1 \le v \le z$,
where $\varepsilon > 0$ is a given number. Then


? 


2 2) du < M(2) E idv < ZO) 
1 





1 


and consequently,

$$z^{-\varepsilon}\psi(z) = o(M(z)).$$

From this we can obviously conclude that $\psi(z) = o(z^{\varepsilon})$ for every $\varepsilon > 0$.
But for 0 < 6 < 2 and all sufficiently large z > 0 
5 
> -1 
|x] è dF (x) < zz 9-94 Qz) <23 , 
2 <121<22 
so that 


= S ym (2-1) 

p [ |x|*dF(x) «2? Ya < оо. 

ИМРЕ = 
The theorem is thereby completely proved. 
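
Theorem 3 can be checked by hand on a law with exact power tails. The sketch below is an illustration under the assumption of the symmetric density $(\alpha/2)|x|^{-\alpha-1}$ on $|x| \ge 1$ (not an example from the original); the function name is hypothetical. It evaluates the truncated absolute moment in closed form, which stays bounded for $\delta < \alpha$ and grows without bound for $\delta \ge \alpha$.

```python
import math

def truncated_abs_moment(X, delta, alpha):
    """int_{|x|<X} |x|**delta dF(x) for the density (alpha/2)|x|**(-alpha-1), |x| >= 1."""
    if delta == alpha:
        return alpha * math.log(X)                       # logarithmic divergence
    return alpha * (X ** (delta - alpha) - 1.0) / (delta - alpha)
```

For $\delta < \alpha$ the value converges to $\alpha/(\alpha - \delta)$ as $X \to \infty$; for example with $\alpha = 1.5$, $\delta = 1$ the limit is 3.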

If the normalizing coefficients $B_n$ in the sums

$$\zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n$$

are not chosen arbitrarily, but are subject to some restriction on their rate
of increase, then it is clear that the domain of attraction under the given
restriction can only be narrower than without this restriction. For example,
the theorem of de Moivre-Laplace shows that in the investigation of the
domain of attraction of the normal law the following choice of the
normalizing coefficients is of special interest:

$$B_n = a\sqrt{n},$$

where $a$ is a constant.

If $V(x)$ is any given stable law, then it is easy to see that it belongs to
its own domain of attraction. If $\alpha$ is the characteristic exponent of the
law $V(x)$, then the normalizing coefficients are determined by the formula

$$B_n = n^{\frac{1}{\alpha}}.$$

We say that the law $F(x)$ belongs to the domain of normal attraction of the law
$V(x)$, if for some $a > 0$ and some $A_n$ the following relation holds:

$$\lim_{n\to\infty}P\left\{\frac{1}{an^{1/\alpha}}\sum_{k=1}^{n}\xi_k - A_n < x\right\} = V(x). \tag{21}$$

Here $\alpha$ is the characteristic exponent of the law $V(x)$.


THEOREM 4. In order that the law $F(x)$ belong to the domain of normal
attraction of the law $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{z^2}{2}}\,dz$, it is necessary and sufficient that
it have a finite variance $\sigma^2$. If this be so, then necessarily $a = \sigma$ in (21), and
$A_n$ may be chosen to be $\frac{\sqrt{n}}{\sigma}\int x\,dF(x)$.

Proof. According to Theorem 4 of § 26, for attraction to the law $\Phi(x)$ it
is necessary and sufficient that for every $\varepsilon > 0$, as $n \to \infty$,

$$n\int_{|x|>\varepsilon B_n}dF(x) \to 0,$$

$$\frac{n}{B_n^2}\left[\int_{|x|<\varepsilon B_n}x^2\,dF(x) - \left(\int_{|x|<\varepsilon B_n}x\,dF(x)\right)^2\right] \to 1. \tag{22}$$

In case of normal attraction, the second of these conditions becomes

$$\lim_{n\to\infty}\left[\int_{|x|<\varepsilon a\sqrt{n}}x^2\,dF(x) - \left(\int_{|x|<\varepsilon a\sqrt{n}}x\,dF(x)\right)^2\right] = a^2.$$

By Theorem 3, the limit

$$\lim_{n\to\infty}\int_{|x|<\varepsilon a\sqrt{n}}x\,dF(x) = \int x\,dF(x)$$

exists; hence

$$\int\left(x - \int x\,dF(x)\right)^2 dF(x) = \sigma^2 \tag{23}$$

also exists. The converse proposition, that the finiteness of the integral
(23) implies (22), is trivial.
THEOREM 5.* In order that the law $F(x)$ belong to the domain of normal
attraction of the stable law $V(x)$ with characteristic exponent $\alpha$ ($0 < \alpha < 2$)
and given constants $c_1$ and $c_2$, it is necessary and sufficient that

$$F(x) = \left(c_1 a^{\alpha} + \alpha_1(x)\right)\frac{1}{|x|^{\alpha}} \quad\text{for } x < 0,$$
$$F(x) = 1 - \left(c_2 a^{\alpha} + \alpha_2(x)\right)\frac{1}{x^{\alpha}} \quad\text{for } x > 0, \tag{24}$$

where $a$ is a positive constant and the functions $\alpha_1(x)$ and $\alpha_2(x)$ satisfy
the conditions

$$\lim_{x\to-\infty}\alpha_1(x) = \lim_{x\to+\infty}\alpha_2(x) = 0. \tag{25}$$

The constant $a$ in (21) and (24) is the same.

* B. V. Gnedenko [38].


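Proof. Necessity. According to Theorem 4 of § 25, necessary and sufficient
conditions for the attraction of the law $F(x)$ to the stable law determined
by the constants $c_1 \ge 0$, $c_2 \ge 0$, and $\alpha$ in formula (12) of § 34,
are that as $n \to \infty$

$$nF(B_n x) \to \frac{c_1}{|x|^{\alpha}} \quad\text{for } x < 0,$$
$$n(1 - F(B_n x)) \to \frac{c_2}{x^{\alpha}} \quad\text{for } x > 0. \tag{26}$$

Since for normal attraction $B_n = an^{\frac{1}{\alpha}}$, by putting $y = B_n x$ we can write (26)
as

$$F(y) = \left(c_1 a^{\alpha} + \alpha_1(y)\right)\frac{1}{|y|^{\alpha}} \quad\text{for } y < 0,$$
$$F(y) = 1 - \left(c_2 a^{\alpha} + \alpha_2(y)\right)\frac{1}{y^{\alpha}} \quad\text{for } y > 0,$$

where $\alpha_1(y)$ and $\alpha_2(y)$ are certain functions satisfying (25). The sufficiency
of the theorem is trivial.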


§ 36. PROPERTIES OF STABLE LAWS

The preceding results enable us to record a series of essential properties
of the stable laws.

1. (P. Lévy [76], p. 201.) For every stable law $V(x)$ except the normal
and the unitary, there exist numbers $\alpha$ ($0 < \alpha < 2$) and $c > 0$ such that

$$\lim_{x\to\infty}x^{\alpha}\left[1 - V(x) + V(-x)\right] = c. \tag{1}$$

Obviously, (1) holds for every law belonging to the domain of normal
attraction of a stable law with characteristic exponent $\alpha$ ($0 < \alpha < 2$).
Since each stable law belongs to its own domain of normal attraction, it
also satisfies (1).

2. (B. V. Gnedenko [38].) Every stable law with characteristic exponent
$\alpha$ ($0 < \alpha < 2$) has finite absolute moments of order $\delta$ ($0 \le \delta < \alpha$).*

* Translator's note. It should be added that, on the other hand, all absolute
moments of order $\ge \alpha$ are infinite. This follows from (1) and is needed below.



This is also a consequence of the fact that every stable law belongs to
its own domain of normal attraction.

Hence,* in particular, it follows that among all stable laws only the
normal law has a finite variance. For $1 < \alpha < 2$ the stable laws have
mathematical expectations; for $0 < \alpha \le 1$ the stable laws have neither
variance nor mathematical expectation.

3. (A. Ya. Khintchine [59], p. 101.) All proper stable laws are continuous
and have derivatives of all orders at every point.

Indeed, since for a proper stable law $V(x)$

$$|v(t)| = e^{-c|t|^{\alpha}} \quad (c > 0,\ 0 < \alpha \le 2), \tag{2}$$

the inversion formula in our case can be written as

$$V(y) - V(x) = \frac{1}{2\pi}\int\frac{e^{-itx} - e^{-ity}}{it}\,v(t)\,dt.$$

Differentiating this formula formally $n$ times, we find that

$$V^{(n)}(x) = \frac{(-i)^{n-1}}{2\pi}\int t^{n-1}e^{-itx}\,v(t)\,dt. \tag{3}$$

The integral (3) converges absolutely, thus proving our assertion. 
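
For $n = 1$ and a symmetric law with $v(t) = e^{-c|t|^{\alpha}}$, formula (3) reduces to $V'(x) = \frac{1}{\pi}\int_0^{\infty}\cos(tx)\,e^{-ct^{\alpha}}\,dt$, which can be evaluated by elementary quadrature. The following is a rough sketch under that symmetry assumption; the function name, the truncation point `upper` and the step count are arbitrary choices for the illustration, and accuracy degrades for small $\alpha$ or large $|x|$.

```python
import math

def symmetric_stable_density(x, alpha, c=1.0, upper=60.0, steps=60000):
    """Trapezoid approximation of (1/pi) * int_0^upper cos(t*x) * exp(-c*t**alpha) dt."""
    h = upper / steps
    # endpoint contributions: the integrand equals 1 at t = 0
    total = 0.5 * (1.0 + math.cos(upper * x) * math.exp(-c * upper ** alpha))
    for k in range(1, steps):
        t = k * h
        total += math.cos(t * x) * math.exp(-c * t ** alpha)
    return total * h / math.pi
```

With `alpha = 1, c = 1` the result agrees with the Cauchy density $1/(\pi(1 + x^2))$, and with `alpha = 2, c = 0.5` with the normal density $e^{-x^2/2}/\sqrt{2\pi}$.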
The last theorem can be improved. 


THEOREM. (A. I. Lapin [71].) A proper stable distribution function with
exponent $\alpha \ge 1$ is analytic on the entire real axis. For $\alpha > 1$ it is an entire
function. For $\alpha = 1$ the radius of convergence of its Taylor series in the
neighborhood of any point is not less than $c$.


Proof. From (2) and (3) we deduce that for every real $x$


Hence 


* Translator's note. See the preceding note.

§ 37. DOMAINS OF PARTIAL ATTRACTION

In § 33 we proved that the stable distributions and only the stable
distributions can appear as the limit distributions for normalized sums

$$\zeta_n = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{B_n} - A_n$$

of independent and identically distributed random variables, as the
number $n$ of summands tends to infinity, running through all integral
values. It may happen that the distribution functions of the sums $\zeta_n$ do
not converge for any choice of the constants $B_n$ and $A_n$, but that for some
subsequence $n_1 < n_2 < \dots < n_k < \dots$ there is convergence. The general
theory permits us only to assert that this limit law is necessarily infinitely
divisible. As A. Ya. Khintchine [58] proved, the incomparably deeper
converse proposition is also true: every infinitely divisible distribution
can appear as the limit for distributions of the sums $\zeta_{n_k}$.

We shall say that $F(x)$ belongs to the domain of partial attraction of the
law $V(x)$ or, what is the same thing, of its type, if there exists a subsequence
$n_1 < n_2 < \dots < n_k < \dots$ such that the distribution functions of the
sums $\zeta_{n_k}$ for suitably chosen constants $B_{n_k}$ and $A_{n_k}$ converge to $V(x)$. In
these terms we can state the preceding theorem as follows:


THEOREM. Every infinitely divisible law has a (non-empty) domain of
partial attraction.


Proof. Consider the infinitely divisible law $V(x)$ for which

$$\log v(t) = i\gamma t + \int\left(e^{itu} - 1 - \frac{itu}{1+u^2}\right)\frac{1+u^2}{u^2}\,dG(u).$$

Since for the normal law the theorem becomes trivial, we may suppose that
the function $G(u)$ has a point of increase $u_0$ different from 0. Let $a > 0$
be such that $a^{-1} < |u_0| < a$. Consider the domain $\Delta_k$ defined by the
inequalities

$$a^{-k} < |u| < a^{k} \quad (k = 1, 2, \dots),$$

and put








With increasing k the domains A; are enlarged; hence 
O< STS.. STS.. 
Furthermore, put 
des | 5 a0), Bye f (1 4-29 dG (w), 
Bk Ak 
C,— f u (1+ 42) dG (u) , 
âk 


o == G( + 0) — G(— 0), 


and introduce the numbers ^, defined as follows: 
—1 
Aye, TE шр (1) (1) 
с? + — Р r=1 
Now let the sequence of natural numbers


GS Gp ...<@„<... 


increase so fast that 


lim g, Yi Le — o, (2) 
е kanti" 
1 з deg 1 3 
2,2 
img, * An? M д, [Cx [= 0. (3) 
n>m ket 
Also, put 
2 " 
Bx = ^ud. ° 


We shall now turn to the determination of a law partially attracted to the 
law V (x). To this end consider the infinitely divisible laws V,(x) for which 


log 4, (0 = | (ettu — 1) te 1 4G (и). 


åk 


Since according to (2) the series 27 converges, 
k 9% 


log f(t) = pH z log de (В) 


is the logarithm of some nisse Теа Putting 


со 


Tk 


q » 
k=on+1 К 


R,— 


we can obviously write 


n—1 


log /(z-) =; log Фк (E + 7.108 Ф (0 +008). — (4) 


But 


n—i n 


= ‚бк, 
D penga [е =. 








; Р у Вк е B8 
= eae Qe n 506g ee UES. 
n k= n 





С 
d Ру И = Уйа 
k=1 
From (4) we now find





n—1 
t „Оз £ l 
log /(у-)= it Nu а. 


m—1 1 3 
+ 0g, ? Zr 42 AE |C HOR), [8] <1 
k=1 


Using (1), (2), (3) we find that for every ¢ 
EN QdnUn-i1 [ә 1 e | 
пао) = it Sgt (оа) + log dn (4) 











1 3 
+ 0(9.К,) +9 (4, ° 1, * us.) 
x fp Bites age A ue f en E u? 
Bn 2 ч 
Writing 
410-1 аа (и) 
Yy b. 1+ | Tu 


we find that for every ¢ 
t ; 4 £ 
noeg) imi at 


+Í (e — 1 — 5) e dG (и) o (1). 


ân 








Therefore 


qn log f (=) itt, => 








=> in f (ri " тл) + ач 


This relation shows that as $n \to \infty$ the distribution functions of the sums

$$\zeta_{q_n} = \frac{\xi_1 + \xi_2 + \dots + \xi_{q_n}}{B_n} - \gamma_n$$

of independent summands $\xi_k$, distributed according to the law $F(x)$,
converge to $V(x)$. We have therefore found for every infinitely divisible
law $V(x)$ a law $F(x)$ belonging to its domain of partial attraction, proving
the theorem.

Soon after A. Ya. Khintchine proved this theorem, examples of laws
which do not belong to the domain of partial attraction of any proper
law * were constructed simultaneously by three authors (P. Lévy [76],
p. 212, A. Ya. Khintchine [59], B. V. Gnedenko [38]). The idea in the
construction of all these examples was the same: the consideration of
a random variable with sufficiently large probabilities for large values.

* As is easily proved, improper laws attract all laws.

Consider the distribution function $F(x)$ for which

$$\log f(t) = \int_{-\infty}^{-e}(\cos tx - 1)\,d\left(\frac{1}{\log|x|}\right) + \int_{e}^{\infty}(\cos tx - 1)\,d\left(-\frac{1}{\log x}\right).$$

$F(x)$ as an infinitely divisible law has a non-empty domain of partial
attraction, but, as we shall see now, is itself not attracted to any law
except the improper ones.

We examine the behavior of the function log f(t) near the point t = 0. 
To this end we note the equation 


tog fi) = | si a (i) 


Y iti eo 
"^ dowd 1 {х 1 
= а іХ Е 
P | Sit (юе) + чё ao) : 
в 1 
Viel 
Since all the integrals considered are negative, 

тз ER 
Yiti [ti 


за) > | SP aes) 


e 


and 





АЯ Viel 
On the other hand, for sufficiently small t 

















Ant+3 
z © о BA 
27 ёх 1 У | zog of ( 1 ) 
a ik a EL EA T d 
f sin (ox) < | sin y а(==) < . sin? 7; ig 
1 x n=0 ini. 
ИП zie. 


Y iti 
Now
4n+3 ants | 
2101" 2111 
a(ags)< | 4( з) 
(рух) log x/’ 
4n--1 Ап +3 
2161 ^ БИГ 
so that 
4n+3 
o Тү" 
v f ' 1 1 1 
1 1 i 
4 (oes) < | 4 (is j- 7 8log |; 
nz iil. log x 2 Д БА 2108 577 gl | 
2141 2111 
and consequently 
со 


ее 1 
| sin 3 igx < 61817: 
1 


Viti 
From the estimates obtained we conclude that for ¢ sufficiently near zero, 
_ Als) 
log f (£) DE log |! P" 


where A(t) is a continuous function satisfying the inequalities 
1 
p< 400 <3. 


We shall now prove that for every choice of the constants $B_n > 0$
[$A_n$ can be taken to be zero by the symmetry of the law $F(x)$] and natural
numbers $n_1 < n_2 < \dots < n_k < \dots$ the function

$$n_k\log f\left(\frac{t}{B_{n_k}}\right)$$

cannot converge to a limit uniformly with respect to $t$ in a neighborhood
of $t = 0$.

Consider all possible cases. 

1. The sequences $\{B_n\}$ and $\{n_k\}$ are such that

$$\lim_{k\to\infty}\frac{\log B_{n_k}}{n_k} = a < \infty.$$

In this case for every $t$ different from 0, the ratio $n_k/\log(|t|/B_{n_k})$ approaches
$-(1/a)$; while for $t = 0$ it approaches 0.* This means that in the case
considered the limit function for $n_k\log f(t/B_{n_k})$ cannot be the logarithm of a
characteristic function.

2. The sequences $\{B_n\}$ and $\{n_k\}$ are such that

$$\lim_{k\to\infty}\frac{\log B_{n_k}}{n_k} = +\infty.$$

* Translator's note. What the authors mean is that $n_k\log f\left(\frac{t}{B_{n_k}}\right)$ is 0 for $t = 0$.

In this case, in any finite interval of $t$

$$\lim_{k\to\infty}n_k\log f\left(\frac{t}{B_{n_k}}\right) = 0$$

and consequently the limit function for $f^{n_k}(t/B_{n_k})$ will be the characteristic
function of the unitary law.

3. If the ratio $(\log B_{n_k})/n_k$ as $k \to \infty$ does not approach a limit, then
we can choose a subsequence $n_k'$ from the sequence $n_k$ for which

$$\lim_{k\to\infty}\frac{\log B_{n_k'}}{n_k'} = a < \infty,$$

and so reduce this last possible case to one of the previous cases.

We have thereby proved our assertion. 

Doeblin [24], B. V. Gnedenko [39], A. V. Groshev [47], and P. Lévy [76]
occupied themselves with the further investigation of domains of partial
attraction. We shall cite some of the results without pausing for their
proofs.

1. Each distribution law $F(x)$ belongs to the domain of partial attraction
of one or a nondenumerable set of types or else does not belong to any
domain of partial attraction at all (Doeblin [24], B. V. Gnedenko [39]).

2. If a distribution law belongs to the domain of partial attraction of
only one type, then this type must be stable (Gnedenko [39]).

3. The domain of partial attraction of a stable type is wider than its
domain of (complete) attraction (Gnedenko [39], Doeblin [24]).

4. If the law $F(x)$ belongs to the domain of partial attraction of the law
$V(x)$, and the law $V(x)$ belongs to the domain of partial attraction of the
law $W(x)$, then $F(x)$ belongs to the domain of partial attraction of the
law $W(x)$ (B. V. Gnedenko [39]).

5. From the result 4 it follows in particular that every law which
belongs to the domain of partial attraction of a type with finite variance,
belongs also to the domain of partial attraction of the normal type; that
the only stable type with finite variance is the normal type; that the result 3
is true.*

6. There exist laws $F(x)$ belonging to the domain of partial attraction
of every infinitely divisible type (the universal laws according to Doeblin's
terminology [24]).


* Translator's note. To see that result 3 follows from 4, we need the following fact.
If $L$ is a stable law then there exists an infinitely divisible law $L_1$ of a different type
from $L$ which belongs to the domain of attraction of $L$. If $L$ is normal this is a
consequence of the remark at the end of § 30 and Theorem 4 of § 35. If $L$ is stable of
exponent $\alpha < 2$, let $N$ be the normal law with mean 0 and variance 1. By Theorem
5 of § 35, $L_1 = L * N$ is in the domain of normal attraction of $L$. Now let $L_2$ be
partially attracted to $L_1$. It follows from 4 that $L_2$ is partially attracted to $L$ but $L_2$
cannot be in the domain of attraction of $L$ because it is partially attracted to $L_1$.
Hence 3 is true.
7. In order that the law $F(x)$ belong to no domain of partial attraction,
it is sufficient that (Doeblin [24])

dF (x) 

lim lim l2127X . 

1» оо X > co [ dF (х) 

[aj >X 

8. In order that the law $F(x)$ belong to the domain of partial attraction
of the normal type, it is necessary and sufficient that (P. Lévy [76], p. 113)

$$\varliminf_{X\to\infty}\frac{X^2\int_{|x|>X}dF(x)}{\int_{|x|<X}x^2\,dF(x)} = 0. \tag{5}$$

9. In order that the law $F(x)$ belong to the domain of partial attraction
of the Poisson law, it is necessary and sufficient (A. V. Groshev [47]) that
for every $\varepsilon > 0$





x 
Ed n 
lim |2112 # = 0. (6) 
[creo dF (Ix) 
|! z—1l <= 


According to result 5 just cited, every law belonging to the domain of 
partial attraction of the Poisson law belongs also to the domain of partial 
attraction of the normal type; consequently if equation (6) is satisfied, 
so also is (5). 


CHAPTER 8 
IMPROVEMENT OF THEOREMS ABOUT THE CONVERGENCE 
TO THE NORMAL LAW 


§ 38. STATEMENT OF THE PROBLEM 


The present chapter is devoted exclusively to the convergence of 
normalized sums to the normal law. In this connection we confine ourselves 
to the case of identically distributed summands. Moreover, we require 
the existence of moments for the summands: in all cases the second moment, 
and in a number of cases also moments of higher order. 

Some of the theorems presented here can be proved also under assumptions 
different from those we make, in particular, for nonidentically 
distributed summands. 

The basic problem in which we are interested here consists in the 
study of the asymptotic behavior of the difference between the distribution 
function \(F_n(x)\) of the normalized sum of the first n terms of the sequence 
of independent random variables 

\[ \xi_1, \xi_2, \ldots, \xi_n, \ldots \]

and the normal distribution function \(\Phi(x)\). 
Throughout this chapter we shall suppose that 

\[ M\xi_n = 0. \]

This restriction, of course, does not diminish the generality of our 
considerations. The principle of the solution of the problem stated was indicated 
by Chebyshev in his fundamental paper of the year 1887 * [17], in which 
he gives the following expansion for the difference \(F_n(x) - \Phi(x)\): 

\[ F_n(x) - \Phi(x) \sim \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left[ \frac{Q_1(x)}{n^{1/2}} + \frac{Q_2(x)}{n} + \cdots + \frac{Q_j(x)}{n^{j/2}} + \cdots \right], \tag{1} \]

where the \(Q_j(x)\) are polynomials, the coefficients of which depend only on 
the first j + 2 moments of the random variable \(\xi_n\). 

At the basis of the expansion (1) lies the more general idea, also due 
to Chebyshev, of expanding an arbitrary function p(x) in a series of 
Chebyshev-Hermite polynomials. These polynomials were introduced by 
Chebyshev in 1859 [18]. We call them Chebyshev-Hermite polynomials. 
The name of Hermite, who discovered them much later, is adjoined to 
that of Chebyshev solely for the sake of distinguishing these from the 
great number of other important types of polynomials which justly bear 
or ought to bear the name Chebyshev. 

* Translator’s note. The year given is the year when the paper was first published 
in Russian, hence it is not the same as that given in the Literature. 

The Chebyshev-Hermite polynomials can be defined by the formula 

\[ H_k(x) = (-1)^k e^{x^2/2} \frac{d^k}{dx^k} e^{-x^2/2}, \tag{2} \]

which gives 

\[ H_0(x) = 1, \qquad H_1(x) = x, \qquad H_2(x) = x^2 - 1. \tag{3} \]
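As a small check of (2) and (3), the polynomials can be generated mechanically. The sketch below is ours and not part of the original text (function names are illustrative). It builds \(H_k\) in two ways: from the definition (2), by differentiating \(e^{-x^2/2}\) k times while tracking only the polynomial factor, and from the three-term recurrence \(H_{k+1}(x) = xH_k(x) - kH_{k-1}(x)\), a standard property of these polynomials that the text does not state.

```python
def hermite_by_recurrence(k):
    """Coefficients of H_k, lowest degree first, via H_{m+1} = x*H_m - m*H_{m-1}."""
    prev, cur = [1], [0, 1]                 # H_0 = 1, H_1 = x
    if k == 0:
        return prev
    for m in range(1, k):
        nxt = [0] + cur                     # x * H_m
        for i, c in enumerate(prev):
            nxt[i] -= m * c                 # minus m * H_{m-1}
        prev, cur = cur, nxt
    return cur


def hermite_by_definition(k):
    """Coefficients of H_k from (2): differentiate q(x)*exp(-x^2/2) k times."""
    q = [1]                                 # current polynomial factor q(x)
    for _ in range(k):
        dq = [i * c for i, c in enumerate(q)][1:]   # q'(x)
        xq = [0] + q                                # x * q(x)
        dq += [0] * (len(xq) - len(dq))
        # d/dx [q e^{-x^2/2}] = (q' - x q) e^{-x^2/2}
        q = [a - b for a, b in zip(dq, xq)]
    return [(-1) ** k * c for c in q]


if __name__ == "__main__":
    for k in range(5):
        print(k, hermite_by_recurrence(k))
```

Both routines return integer coefficient lists, so the two constructions can be compared exactly for as many k as desired.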


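The expansion of an arbitrary function proposed by Chebyshev in 1859 has 
the form 

\[ p(x) = \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{\infty} \frac{c_k}{k!} \frac{d^k}{dx^k} e^{-x^2/2} = \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{\infty} (-1)^k \frac{c_k}{k!} e^{-x^2/2} H_k(x), \tag{4} \]

where 

\[ c_k = (-1)^k \int H_k(x) p(x) \, dx. \tag{5} \]

Since 

\[ H_k(x) = \sum_{j=0}^{k} h_j^{(k)} x^j \tag{6} \]

is a polynomial of the kth degree, the coefficients \(c_k\) can be expressed in the 
form 

\[ c_k = (-1)^k \sum_{j=0}^{k} h_j^{(k)} a_j \tag{7} \]

by means of the moments 

\[ a_j = \int p(x) x^j \, dx; \qquad j = 0, 1, 2, \ldots, k. \tag{8} \]

In the case of the probability density p(x) of a normalized random 
variable with 

\[ M\xi = 0, \qquad D^2\xi = 1 \tag{9} \]

the first few moments have the values 

\[ a_0 = 1, \qquad a_1 = 0, \qquad a_2 = 1, \]

and hence the expansion (4) takes the simplified form 

\[ p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \left[ 1 - \frac{c_3}{3!} H_3(x) + \frac{c_4}{4!} H_4(x) - \cdots \right]. \tag{10} \]

Put 

\[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-z^2/2} \, dz. \tag{11} \]

Then for the distribution function F(x) it is natural to consider the 
integrated expansion (10): 

\[ F(x) = \Phi(x) + \frac{c_3}{3!} \Phi^{(3)}(x) + \frac{c_4}{4!} \Phi^{(4)}(x) + \cdots, \tag{12} \]

where 

\[ \Phi^{(k)}(x) = \frac{d^k}{dx^k} \Phi(x). \]

By successive integrations by parts it is not difficult to verify the equation 

\[ \int e^{itx} \, d\Phi^{(k)}(x) = \frac{1}{\sqrt{2\pi}} \int e^{itx} \frac{d^k}{dx^k} e^{-x^2/2} \, dx = (-it)^k e^{-t^2/2}. \tag{13} \]

Thus to the expansion (12) corresponds formally the expansion of the 
characteristic function 

\[ f(t) = e^{-t^2/2} \left[ 1 + \sum_{k=3}^{\infty} \frac{c_k}{k!} (-it)^k \right]. \tag{14} \]

From this we can deduce expressions for the coefficients \(c_k\) by means of 
semi-invariants. For this purpose we make use of the formal relation 
(see § 15) 

\[ \log f(t) = \sum_{r=1}^{\infty} \frac{\kappa_r}{r!} (it)^r. \]

In our case of a normalized random variable, 

\[ \kappa_1 = 0, \qquad \kappa_2 = 1, \]

which gives 

\[ \log f(t) = -\frac{t^2}{2} + \sum_{r=3}^{\infty} \frac{\kappa_r}{r!} (it)^r. \tag{15} \]

Putting w = −it, we obtain from (14) and (15) 

\[ 1 + \sum_{k=3}^{\infty} \frac{c_k}{k!} w^k = \exp\left\{ \sum_{r=3}^{\infty} \frac{\kappa_r}{r!} (-w)^r \right\}, \tag{16} \]

i.e., 

\[ c_3 = -\kappa_3, \qquad c_4 = \kappa_4, \qquad c_5 = -\kappa_5. \tag{17} \]

It is easy to see that the coefficient \(c_r\) depends only on the first r 
semi-invariants. 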
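The relation (16) determines every \(c_k\) from the semi-invariants, and the computation is easily mechanized. The following sketch is ours: the function name and the dictionary input are illustrative assumptions. It expands the exponential in (16) as a formal power series in exact rational arithmetic, using the identity E' = A'E for E = exp(A), and reads off the \(c_k\).

```python
from fractions import Fraction
from math import factorial


def chebyshev_coefficients(kappa, order):
    """Return [c_0, ..., c_order] defined by the formal identity (16).

    kappa maps r (>= 3) to the r-th semi-invariant of a normalized variable;
    missing entries are treated as 0.  Values should be ints or Fractions.
    """
    # A(w) = sum_{r>=3} kappa_r (-w)^r / r!, stored as coefficients of w^r
    a = [Fraction(0)] * (order + 1)
    for r in range(3, order + 1):
        a[r] = Fraction((-1) ** r * kappa.get(r, 0), factorial(r))
    # E = exp(A):  m * e_m = sum_{j=1}^{m} j * a_j * e_{m-j}
    e = [Fraction(1)] + [Fraction(0)] * order
    for m in range(1, order + 1):
        e[m] = sum(j * a[j] * e[m - j] for j in range(1, m + 1)) / m
    # the coefficient of w^k in E is c_k / k!
    return [e[k] * factorial(k) for k in range(order + 1)]


if __name__ == "__main__":
    # illustrative input: kappa_r = (r - 1)!, the semi-invariants of a
    # centered exponential variable with unit variance
    print(chebyshev_coefficients({3: 2, 4: 6, 5: 24, 6: 120}, 6))
```

Carrying the expansion one step beyond (17) gives \(c_6 = \kappa_6 + 10\kappa_3^2\), which shows concretely how \(c_r\) involves the lower semi-invariants as well as \(\kappa_r\).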

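After these general remarks we now turn to the case of the distribution 
functions \(F_n(x)\) of the normalized sums 

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \cdots + \xi_n}{\sigma\sqrt{n}} \]

of identically distributed summands \(\xi_n\) with 

\[ M\xi_n = 0, \qquad D^2\xi_n = \sigma^2. \]

Putting 

\[ \xi_n' = \frac{\xi_n}{\sigma}, \]

we can write \(\zeta_n\) in the form 

\[ \zeta_n = \frac{\xi_1' + \xi_2' + \cdots + \xi_n'}{\sqrt{n}}. \]

Denoting the semi-invariants of \(\xi_n\) by \(\kappa_s\), we easily find (see § 15) that the 
semi-invariants of \(\xi_n'\) are 

\[ \lambda_1 = 0, \qquad \lambda_2 = 1, \qquad \lambda_r = \frac{\kappa_r}{\sigma^r} \quad \text{for } r > 2, \tag{18} \]

and the semi-invariants of \(\zeta_n\) are 

\[ \kappa_1^{(n)} = 0, \qquad \kappa_2^{(n)} = 1, \qquad \kappa_r^{(n)} = \frac{n\lambda_r}{n^{r/2}} = \frac{\lambda_r}{n^{(r-2)/2}} \quad \text{for } r > 2. \tag{19} \]

The expansion (12) in our case takes the form 

\[ F_n(x) = \Phi(x) + \sum_{r=3}^{\infty} \frac{c_r^{(n)}}{r!} \Phi^{(r)}(x), \tag{20} \]

where the coefficients \(c_r^{(n)}\) are calculated from the formal equation 

\[ 1 + \sum_{r=3}^{\infty} \frac{c_r^{(n)}}{r!} w^r = \exp\left\{ \sum_{r=3}^{\infty} \frac{\lambda_r}{r! \, n^{(r-2)/2}} (-w)^r \right\}, \tag{21} \]

i.e., 

\[ c_3^{(n)} = -\frac{\lambda_3}{n^{1/2}}, \qquad c_4^{(n)} = \frac{\lambda_4}{n}, \quad \ldots \tag{22} \]

After substituting the expressions for \(c_r^{(n)}\) into (20), it is natural to collect 
terms of the same order in n. This then leads to Chebyshev’s expansion 

\[ F_n(x) - \Phi(x) \sim \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left[ \frac{Q_1(x)}{n^{1/2}} + \frac{Q_2(x)}{n} + \cdots \right]. \tag{23} \]
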
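It is easily seen that 

\[ Q_1(x) = \frac{\lambda_3}{6} (1 - x^2), \qquad Q_2(x) = -\frac{10\lambda_3^2}{6!} x^5 + \left( \frac{5\lambda_3^2}{36} - \frac{\lambda_4}{24} \right) x^3 + \left( \frac{\lambda_4}{8} - \frac{5\lambda_3^2}{24} \right) x. \tag{24} \]

A general method of calculating the polynomials \(Q_k\) is contained in the 
following. Expand the right side of (21) in powers of \(1/\sqrt{n}\): 

\[ \exp\left\{ \sum_{k=1}^{\infty} \frac{\lambda_{k+2}}{(k+2)!} (-w)^{k+2} \left( \frac{1}{\sqrt{n}} \right)^k \right\} = 1 + \sum_{k=1}^{\infty} P_k(-w) \left( \frac{1}{\sqrt{n}} \right)^k. \tag{25} \]

It is easily seen that \(P_k(-w)\) is a polynomial in w of degree 3k with 
coefficients depending on \(\lambda_3, \lambda_4, \ldots, \lambda_{k+2}\): 

\[ P_1(-w) = \frac{\lambda_3}{3!} (-w)^3, \qquad P_2(-w) = \frac{\lambda_4}{4!} w^4 + \frac{10\lambda_3^2}{6!} w^6. \tag{26} \]

Comparing the expansion 

\[ 1 + \sum_{r=3}^{\infty} \frac{c_r^{(n)}}{r!} w^r = 1 + \sum_{k=1}^{\infty} P_k(-w) \left( \frac{1}{\sqrt{n}} \right)^k \tag{27} \]

with the expansion (20), we obtain 

\[ F_n(x) - \Phi(x) \sim \sum_{k=1}^{\infty} P_k(-\Phi) \left( \frac{1}{\sqrt{n}} \right)^k, \tag{28} \]

where \(P_k(-\Phi)\) is calculated by replacing \(w^r\) by \(\Phi^{(r)}(x)\) in \(P_k(-w)\). 
From (1) and (28) we deduce a general formula for finding the 
polynomials \(Q_k\): 

\[ \frac{1}{\sqrt{2\pi}} Q_k(x) e^{-x^2/2} = P_k(-\Phi). \tag{29} \]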
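The recipe contained in (25) to (29) can be carried out mechanically. The sketch below is ours (names and input conventions are illustrative assumptions). It expands the left side of (25) in powers of \(1/\sqrt{n}\) with exact rational coefficients, and then replaces \(w^r\) by \(\Phi^{(r)}(x)\), using the fact that by (2) and (11) \(\Phi^{(r)}(x) = (-1)^{r-1} H_{r-1}(x) e^{-x^2/2}/\sqrt{2\pi}\).

```python
from fractions import Fraction
from math import factorial


def hermite(k):
    """Coefficients of H_k (lowest degree first), H_{m+1} = x*H_m - m*H_{m-1}."""
    prev, cur = [1], [0, 1]
    if k == 0:
        return prev
    for m in range(1, k):
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= m * c
        prev, cur = cur, nxt
    return cur


def edgeworth_polynomials(lam, kmax):
    """Return [Q_1, ..., Q_kmax] as coefficient lists (lowest degree first).

    lam maps r (>= 3) to lambda_r; missing entries are treated as 0.
    """
    # P[m] maps a power of w to its coefficient in P_m(-w), cf. (25)
    P = [{0: Fraction(1)}]
    for m in range(1, kmax + 1):
        acc = {}
        for j in range(1, m + 1):
            r = j + 2
            a_j = Fraction((-1) ** r * lam.get(r, 0), factorial(r))
            for p, c in P[m - j].items():
                # exp-series recursion: m * E_m = sum_j j * A_j * E_{m-j}
                acc[p + r] = acc.get(p + r, 0) + j * a_j * c
        P.append({p: c / m for p, c in acc.items() if c != 0})
    Q = []
    for m in range(1, kmax + 1):
        q = [Fraction(0)] * (3 * m)          # Q_m has degree 3m - 1
        for r, c in P[m].items():
            # w^r -> Phi^(r) = (-1)^(r-1) H_{r-1}(x) * normal density, cf. (29)
            for i, h in enumerate(hermite(r - 1)):
                q[i] += (-1) ** (r - 1) * c * h
        Q.append(q)
    return Q


if __name__ == "__main__":
    for q in edgeworth_polynomials({3: 2, 4: 6}, 2):
        print([str(c) for c in q])
```

For any \(\lambda_3\) the first polynomial returned is \((\lambda_3/6)(1 - x^2)\), in agreement with (24); the illustrative input \(\lambda_3 = 2\), \(\lambda_4 = 6\) corresponds to exponentially distributed summands.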


Expansion of the type (12) was studied after Chebyshev by Bruns [13] 
and Charlier [15] from the standpoint of probability. Chebyshev's 
expansion in the form (28) was studied in detail by Edgeworth [25]. 

The central problem of this chapter consists in the study of the 
asymptotic behavior as \(n \to \infty\) of the remainder term 

\[ R_{nk}(x) = F_n(x) - \Phi(x) - \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left[ \frac{Q_1(x)}{n^{1/2}} + \cdots + \frac{Q_k(x)}{n^{k/2}} \right] \]

of Chebyshev’s expansion. The most definitive result in this direction, 
due to Cramér, is presented in § 45. The case k = 1 is studied in detail 
in §§ 42-43, where results are obtained under broader assumptions than 
in § 45. In § 47 is given a theorem analogous to the theorem in § 45 but 
for probability densities. § 46 has a more elementary character: there we 
discuss the question of conditions for the convergence of the probability 
densities \(p_n(x)\) of normalized sums \(\zeta_n\) to the normal density 

\[ \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \]

without regard to further improvements and estimates of the remainder 
terms. §§ 39 and 41 are auxiliary in character, and § 40 contains a result of 
which the complete elucidation is given in § 42. 


§ 39. TWO AUXILIARY THEOREMS 


THEOREM 1. Let A, T, and \(\varepsilon > 0\) be constants, F(x) a nondecreasing 
function, and G(x) a function of bounded variation. If 

1. \(F(-\infty) = G(-\infty)\), \(F(+\infty) = G(+\infty)\); 

2. \(\int |F(x) - G(x)| \, dx < \infty\); 

3. \(G'(x)\) exists for all x and \(|G'(x)| \le A\); 

4. \(\int_{-T}^{T} \left| \frac{f(t) - g(t)}{t} \right| dt = \varepsilon\), 

then to every number k > 1 there corresponds a finite positive number c(k) 
depending only on k such that 

\[ |F(x) - G(x)| \le k \frac{\varepsilon}{2\pi} + c(k) \frac{A}{T}. \tag{1} \]
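Proof. First of all we note that from the equation 

\[ f(t) - g(t) = \int e^{itx} \, d[F(x) - G(x)] \]

we can deduce by integration by parts that * 

\[ \frac{f(t) - g(t)}{-it} = \int e^{itx} [F(x) - G(x)] \, dx. \tag{2} \]

We note further that for the proof of the theorem it is sufficient to 
consider the case A = T = 1. Otherwise, indeed, we may consider the 
functions 

\[ F_1(x) = \frac{T}{A} F\left( \frac{x}{T} \right) \quad \text{and} \quad G_1(x) = \frac{T}{A} G\left( \frac{x}{T} \right). \]

Evidently, 

\[ |G_1'(x)| \le 1, \qquad \int_{-1}^{1} \left| \frac{f_1(t) - g_1(t)}{t} \right| dt = \frac{T}{A} \varepsilon = \varepsilon_1. \]

If we suppose that the theorem is proved for the case A = T = 1, then 

\[ |F_1(x) - G_1(x)| \le k \frac{\varepsilon_1}{2\pi} + c(k). \]

Using the definition of \(F_1(x)\), \(G_1(x)\), and \(\varepsilon_1\), we obtain (1). 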
* Translator's note. For t = 0 the left side of (2) is defined as the limit as \(t \to 0\), 
which is finite by the assumption 2. 

Consider the functions 

\[ H(x) = \frac{3}{8\pi} \left( \frac{\sin \frac{x}{4}}{\frac{x}{4}} \right)^4 \]

and 

\[ h(t) = \begin{cases} 0 & \text{for } |t| \ge 1, \\ 2(1 - |t|)^3 & \text{for } \tfrac{1}{2} \le |t| \le 1, \\ 1 - 6t^2 + 6|t|^3 & \text{for } 0 \le |t| \le \tfrac{1}{2}. \end{cases} \]

It is easily verified that they satisfy the following relations: 

1. \(h(t) = \int e^{itx} H(x) \, dx\), 

2. \(\int |x| H(x) \, dx = b < \infty\), 

3. \(\int H(x) \, dx = 1\). 
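Relations 1 and 3 lend themselves to a direct numerical check. The sketch below is ours and rests on an assumption about the garbled displays: that \(H(x) = \frac{3}{8\pi}\left(\sin\frac{x}{4} / \frac{x}{4}\right)^4\) and that h(t) is the piecewise cubic equal to \(1 - 6t^2 + 6|t|^3\) on \(|t| \le 1/2\), to \(2(1 - |t|)^3\) on \(1/2 \le |t| \le 1\), and to 0 elsewhere. The integral is approximated by the trapezoidal rule on a truncated interval; the truncation width and step are arbitrary choices.

```python
import math


def H(x):
    """Assumed kernel: (3 / (8 pi)) * (sin(x/4) / (x/4))**4."""
    if x == 0.0:
        return 3.0 / (8.0 * math.pi)
    u = x / 4.0
    return 3.0 / (8.0 * math.pi) * (math.sin(u) / u) ** 4


def h(t):
    """Assumed transform of H: a piecewise cubic vanishing for |t| >= 1."""
    a = abs(t)
    if a >= 1.0:
        return 0.0
    if a >= 0.5:
        return 2.0 * (1.0 - a) ** 3
    return 1.0 - 6.0 * a * a + 6.0 * a ** 3


def fourier_of_H(t, half_width=400.0, step=0.02):
    """Trapezoidal approximation of the integral of exp(itx) H(x) dx.

    H is even, so the integral is real and only the cosine part is needed.
    """
    n = int(round(half_width / step))
    total = 0.0
    for i in range(-n, n + 1):
        weight = 0.5 if abs(i) == n else 1.0
        x = i * step
        total += weight * math.cos(t * x) * H(x)
    return total * step


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.75, 1.2):
        print(t, fourier_of_H(t), h(t))
```

The value at t = 0 checks relation 3; the vanishing of the transform for |t| > 1 is the property used in the proof to restrict the integration to the interval (−1, 1).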


Construct the function 

\[ v(x) = \int H(x - y) [F(y) - G(y)] \, dy. \]

According to (2) together with Theorem 3 of § 11, the equation 

\[ \frac{f(t) - g(t)}{-it} h(t) = \int e^{itx} v(x) \, dx \]

holds. Since the function on the left is absolutely integrable, 

\[ v(x) = \frac{1}{2\pi} \int e^{-itx} \frac{f(t) - g(t)}{-it} h(t) \, dt. \]

Using the definition of the function v(x) and the fact that h(t) = 0 for 
|t| > 1, we find that 

\[ \int H(x - y) [F(y) - G(y)] \, dy = \frac{1}{2\pi} \int_{-1}^{1} e^{-itx} \frac{f(t) - g(t)}{-it} h(t) \, dt. \tag{3} \]

We now put 

\[ \Delta = \max_{-\infty < x < \infty} |F(x) - G(x)|. \]

Without loss of generality we may suppose that 

\[ \Delta = |F(0) - G(0)| \]

and that F(0) > G(0). Under these conditions, taking into account the 
relation \(|G'(x)| \le 1\), we find that for \(0 \le x \le \Delta\) 

\[ \Delta \ge F(x) - G(x) \ge \Delta - |x|. \tag{4} \]
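From (3) we conclude that 

\[ \left| \int H(x - y) [F(y) - G(y)] \, dy \right| \le \frac{1}{2\pi} \int_{-1}^{1} \left| \frac{f(t) - g(t)}{t} \right| dt = \frac{\varepsilon}{2\pi}. \]

By (4) it is obvious that 

\[ \begin{aligned} \frac{\varepsilon}{2\pi} &\ge \left| \int_{-\infty}^{\infty} H(x - y) [F(y) - G(y)] \, dy \right| \\ &\ge \int_{0}^{\Delta} H(x - y) [F(y) - G(y)] \, dy - \left| \int_{-\infty}^{0} H(x - y) [F(y) - G(y)] \, dy \right| - \left| \int_{\Delta}^{\infty} H(x - y) [F(y) - G(y)] \, dy \right| \\ &\ge \int_{0}^{\Delta} (\Delta - y) H(x - y) \, dy - \Delta \left[ \int_{-\infty}^{0} H(x - y) \, dy + \int_{\Delta}^{\infty} H(x - y) \, dy \right] \\ &= \int_{0}^{\Delta} (2\Delta - y) H(x - y) \, dy - \Delta. \end{aligned} \]

But 

\[ \int_{0}^{\Delta} (2\Delta - y) H(x - y) \, dy = \int_{-x}^{\Delta - x} (2\Delta - x - z) H(z) \, dz \ge (2\Delta - x) \int_{-x}^{\Delta - x} H(z) \, dz - \int |z| H(z) \, dz, \]

so that 

\[ \frac{\varepsilon}{2\pi} \ge (2\Delta - x) \int_{-x}^{\Delta - x} H(z) \, dz - \Delta - b. \]

Since \(0 \le x \le \Delta\), we can find a \(\theta\) (\(0 \le \theta \le 1\)) such that \(x = \Delta\theta\). Thus 
the preceding inequality takes the form 

\[ \Delta \left[ (2 - \theta) \int_{-\Delta\theta}^{\Delta(1 - \theta)} H(z) \, dz - 1 \right] \le \frac{\varepsilon}{2\pi} + b. \]

Whatever k > 1 may be, it is always possible to choose \(\theta(k) > 0\) and 
a(k) > 0, so that 

\[ (2 - \theta(k)) \int_{-\theta(k) a(k)}^{(1 - \theta(k)) a(k)} H(z) \, dz - 1 = \frac{1}{k}. \tag{5} \]

Consider the two possible cases: 

1. \(\Delta < a(k)\), \qquad 2. \(\Delta \ge a(k)\). 

In the second case we choose x so that \(\theta = \theta(k)\). It is then evident that 

\[ 1 + \frac{1}{k} = (2 - \theta(k)) \int_{-a(k)\theta(k)}^{(1 - \theta(k)) a(k)} H(z) \, dz \le (2 - \theta) \int_{-\Delta\theta}^{\Delta(1 - \theta)} H(z) \, dz. \]

Thus in this case, 

\[ \Delta \le k \frac{\varepsilon}{2\pi} + kb < k \frac{\varepsilon}{2\pi} + (kb + a(k)). \]

In the first case, 

\[ \Delta < a(k) < k \frac{\varepsilon}{2\pi} + (kb + a(k)). \]

Putting c(k) = kb + a(k), we complete the proof of the theorem. 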
We need another theorem, similar to the one just proved, but in which 
it is not assumed that the function G(x) is continuous for all values x. 


THEOREM 2. Let A, T, \(\varepsilon\) be arbitrary positive constants, F(x) a nondecreasing 
purely discontinuous function, and G(x) a function of bounded variation. 
If 

1) \(F(-\infty) = G(-\infty)\), \(F(+\infty) = G(+\infty)\), 

2) \(\int |F(x) - G(x)| \, dx < \infty\), 

3) the functions F(x) and G(x) have discontinuities only at the points 
\(x = x_\nu\) (\(x_\nu < x_{\nu+1}\); \(\nu = 0, \pm 1, \pm 2, \ldots\)), and there exists an l such that 
\(\min (x_{\nu+1} - x_\nu) \ge l\), 

4) everywhere except at \(x = x_\nu\) (\(\nu = 0, \pm 1, \pm 2, \ldots\)), 

\[ |G'(x)| \le A, \]

5) \(\int_{-T}^{T} \left| \frac{f(t) - g(t)}{t} \right| dt = \varepsilon\), 

then to every number k > 1 there correspond two finite numbers \(c_1(k)\) 
and \(c_2(k)\) depending only on k and such that 

\[ |F(x) - G(x)| \le k \frac{\varepsilon}{2\pi} + c_1(k) \frac{A}{T}, \]

whenever \(T \cdot l \ge c_2(k)\). 
Proof. As in the theorem just proved, we may put A = T = 1 and 

\[ \Delta = \sup_{-\infty < x < +\infty} |F(x) - G(x)| = |F(0) - G(0)|. \]

The behavior of the functions F(x) and G(x) in the neighborhood of x = 0 
can be reduced to several cases, each of which must be examined 
individually. We confine ourselves to the case that \(x_\nu \ne 0\) for every \(\nu\) and the 
distance from the origin to the nearest \(x_\nu > 0\) is not less than l/2. If we 
put \(\delta = \min (\Delta, l/2)\), then, as in the preceding theorem, we find that 
(for \(x = \delta\theta\)) 

\[ \Delta \left[ (2 - \theta) \int_{-\delta\theta}^{\delta(1 - \theta)} H(y) \, dy - 1 \right] \le \frac{\varepsilon}{2\pi} + b. \]

For an arbitrary k > 1 we can again choose a sufficiently small \(\theta(k)\) and 
a sufficiently large a(k) so that (5) holds. If \(\Delta \ge a(k)\) and \(l/2 \ge a(k)\), then 
\(\delta \ge a(k)\) and therefore 

\[ \Delta \le k \frac{\varepsilon}{2\pi} + kb < k \frac{\varepsilon}{2\pi} + (kb + a(k)). \]

In case \(\Delta < a(k)\) this inequality is obviously satisfied. Putting \(c_1(k) = kb + a(k)\) 
and \(c_2(k) = 2a(k)\), we obtain the proof of the theorem. 


§ 40. ESTIMATION OF THE REMAINDER TERM IN LYAPUNOV'S THEOREM 


Before turning to the essence of the problem, we shall introduce some 
notations which will be adhered to in what follows. We put 

\[ \rho_s = \frac{\beta_s}{\sigma^s} \quad \text{and} \quad \lambda_s = \frac{\kappa_s}{\sigma^s}. \tag{1} \]

Obviously, \(\rho_s\) and \(\lambda_s\) are respectively the sth absolute moment and the sth 
semi-invariant of the random variable \(\xi_n/\sigma\). Therefore, according to (5) and (2) 
of § 15, the following inequalities are satisfied: 

\[ 0 \le \rho_1 \le \rho_2^{1/2} \le \rho_3^{1/3} \le \cdots \le \rho_s^{1/s} \le \cdots, \tag{2} \]

\[ |\lambda_s| \le s^s \rho_s. \tag{3} \]


Lyapunov in the proof of his theorem obtained not only the proposition 
about the convergence of the distribution functions of the normalized 
sums to the normal law, but also an estimate of the speed of convergence. 
In case the existence of the third moments is assumed, the inequality 

\[ |F_n(x) - \Phi(x)| < c \rho_3 \frac{\log n}{\sqrt{n}} \]

holds, where c is a constant (H. Cramér [19] proved that c can be taken to 
be 3). 

The object of this section is to prove the following more definitive 
result, first obtained by H. Cramér [21] under some additional assumptions 
and in the form cited here by Esseen [26] and A. C. Berry [10]. (For the 
extension of this proposition to the case of nonidentically distributed 
summands see Cramér [21].) 


THEOREM 1. If the random variables \(\xi_1, \xi_2, \ldots, \xi_n, \ldots\) have finite third 
moments, then 

\[ |F_n(x) - \Phi(x)| \le c \frac{\rho_3}{\sqrt{n}}, \]

where c is a constant (in the paper [10] Berry proved that c can be taken 
to be 1.88 *). 


* Translator's note. Berry's computation is invalidated by an error; see P. L. Hsu, 
The approximate distributions of the mean and variance of a sample of independent 
variables, Annals of Mathematical Statistics, 16, 1-29 (1945). 

We remark that the order of the estimate, which is given by this 
theorem, cannot be improved in the general case, even if the existence of 
moments of all orders is assumed for the summands. It is easy to convince 
ourselves of this by considering, for example, summands which are 
identically distributed and take only two values: −1 and +1, each with 
probability 1/2. At the point x = 0 the function \(F_n(x)\), as it follows from the local 
theorem of de Moivre-Laplace, has a jump asymptotically equal to \(1/\sqrt{2\pi n}\). 
This obviously proves the remark just made. 
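The remark can be illustrated numerically. The following sketch is ours (the function names are illustrative). It computes \(\sqrt{n} \sup_x |F_n(x) - \Phi(x)|\) exactly for sums of n independent signs ±1. Since \(F_n\) is a step function and \(\Phi\) is increasing, the supremum is attained at a jump point, either from the left or from the right, so it suffices to scan the atoms of the binomial law.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scaled_sup_deviation(n):
    """sqrt(n) * sup_x |F_n(x) - Phi(x)| for sums of n independent +-1 signs."""
    sup = 0.0
    cdf = 0.0
    for k in range(n + 1):                     # k = number of +1 summands
        x = (2 * k - n) / math.sqrt(n)         # atom of the normalized sum
        left = cdf                             # F_n(x - 0)
        cdf += math.comb(n, k) / 2 ** n        # exact integer ratio, then rounded
        g = normal_cdf(x)
        sup = max(sup, abs(left - g), abs(cdf - g))
    return math.sqrt(n) * sup


if __name__ == "__main__":
    for n in (25, 100, 400, 1600):
        print(n, scaled_sup_deviation(n))
```

The printed values do not tend to zero as n grows, so no estimate of order smaller than \(1/\sqrt{n}\) can hold uniformly in x for this scheme.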

The proof of the theorem formulated above is based on Theorem 1 of 
§ 39 and the following proposition. 


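THEOREM 2. If the random variables \(\xi_1, \xi_2, \ldots, \xi_n\) are identically 
distributed and have finite third moments, then for 

\[ |t| \le T_n = \frac{\sqrt{n}}{5\rho_3} \]

the following inequality holds: 

\[ \left| f_n(t) - e^{-t^2/2} \right| \le \frac{7}{6} \frac{\rho_3}{\sqrt{n}} |t|^3 e^{-t^2/4}. \]
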
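Proof. Indeed, from 

\[ f(t) = \int e^{itx} \, dF(x) \]

we deduce that * 

\[ \left| f\left( \frac{t}{B_n} \right) - 1 \right| \le \frac{\sigma^2 t^2}{2B_n^2} = \frac{t^2}{2n}. \]

* Translator's note. \(B_n = \sqrt{n\beta_2} = \sigma\sqrt{n}\). 

Since according to (5) of § 15 \(\sigma^3 = \beta_2^{3/2} \le \beta_3\), for \(|t| \le T_n\) 

\[ \left| f\left( \frac{t}{B_n} \right) \right| \ge \frac{24}{25}, \]

so that \(f(t/B_n)\) is different from zero for \(|t| \le T_n\). Therefore in this 
interval we may write 

\[ f\left( \frac{t}{B_n} \right) = e^{\log f(t/B_n)}, \]

and consequently 

\[ f_n(t) = e^{n \log f(t/B_n)}. \]

But 

\[ n \log f\left( \frac{t}{B_n} \right) = -\frac{t^2}{2} + \frac{n t^3}{6 B_n^3} \alpha, \]

where 

\[ \alpha = \left[ \frac{d^3}{dz^3} \log f(z) \right]_{z = \theta t / B_n}. \]

Consequently, 

\[ \left| f_n(t) - e^{-t^2/2} \right| = e^{-t^2/2} \left| e^{n t^3 \alpha / (6 B_n^3)} - 1 \right|. \]

Now 

\[ |e^{\beta} - 1| \le |\beta| e^{|\beta|} \]

and * 

\[ \left| \frac{d^3}{dz^3} \log f(z) \right| < 7\beta_3. \]

Hence 

\[ \left| f_n(t) - e^{-t^2/2} \right| \le \frac{7 |t|^3 \beta_3}{6 \sigma^3 \sqrt{n}} \exp\left\{ -\frac{t^2}{2} + \frac{7 |t|^3 \beta_3}{6 \sigma^3 \sqrt{n}} \right\}. \]

Finally, for \(|t| \le T_n\), 

\[ -\frac{t^2}{2} + \frac{7 |t|^3 \beta_3}{6 \sigma^3 \sqrt{n}} \le -\frac{t^2}{2} + \frac{7}{30} t^2 \le -\frac{t^2}{4}, \]

hence 

\[ \left| f_n(t) - e^{-t^2/2} \right| \le \frac{7}{6} \frac{\rho_3}{\sqrt{n}} |t|^3 e^{-t^2/4}. \]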
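The inequality of Theorem 2 can be tested on a case where the characteristic function is explicit. The sketch below is ours and assumes that the inequality reads \(|f_n(t) - e^{-t^2/2}| \le \frac{7}{6}\frac{\rho_3}{\sqrt{n}}|t|^3 e^{-t^2/4}\) for \(|t| \le \sqrt{n}/(5\rho_3)\). This is our reading of the display, so the constants in the code are assumptions. For summands taking the values ±1 with probability 1/2 each, \(\sigma = 1\), \(\rho_3 = 1\) and \(f_n(t) = \cos^n(t/\sqrt{n})\).

```python
import math


def worst_ratio(n, points=2000):
    """Largest value of (left side) / (right side) on a grid of 0 < t <= T_n.

    Summands are +-1 with probability 1/2, so rho_3 = 1 and
    f_n(t) = cos(t / sqrt(n)) ** n.  The constants 7/6 and 5 are assumed.
    """
    rho3 = 1.0
    root_n = math.sqrt(n)
    t_max = root_n / (5.0 * rho3)
    worst = 0.0
    for i in range(1, points + 1):
        t = t_max * i / points
        lhs = abs(math.cos(t / root_n) ** n - math.exp(-t * t / 2.0))
        rhs = (7.0 / 6.0) * rho3 * t ** 3 / root_n * math.exp(-t * t / 4.0)
        worst = max(worst, lhs / rhs)
    return worst


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, worst_ratio(n))
```

A ratio below 1 over the whole grid is consistent with the stated bound. For symmetric summands the ratio is in fact small, because the third semi-invariant vanishes and the true difference is of order 1/n.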


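Now the proof of Theorem 1 can be obtained in a few words. In fact, 
we put in Theorem 1 of § 39 

\[ F(x) = F_n(x), \qquad G(x) = \Phi(x), \]

\[ A = \max |\Phi'(x)| = \frac{1}{\sqrt{2\pi}}, \qquad T = T_n = \frac{\sqrt{n}}{5\rho_3}. \]

According to the proposition just proved, 

\[ \varepsilon = \int_{-T}^{T} \left| \frac{f_n(t) - e^{-t^2/2}}{t} \right| dt \le \frac{7\rho_3}{6\sqrt{n}} \int t^2 e^{-t^2/4} \, dt = \frac{14\sqrt{\pi}}{3} \frac{\rho_3}{\sqrt{n}}. \]

Therefore by Theorem 1 of § 39 

\[ |F_n(x) - \Phi(x)| \le k \frac{\varepsilon}{2\pi} + c(k) \frac{A}{T} \le c \frac{\rho_3}{\sqrt{n}}, \qquad \text{Q.E.D.} \]

* This estimate can be obtained as follows: 

\[ \left| \frac{d^3}{dz^3} \log f(z) \right| = \left| \frac{f'''(z) f^2(z) - 3 f''(z) f'(z) f(z) + 2 f'^3(z)}{f^3(z)} \right| \le \frac{\beta_3 + 3\beta_1\beta_2 + 2\beta_1^3}{\left( \frac{24}{25} \right)^3}. \]

Now 

\[ \beta_1 \le \beta_2^{1/2} \le \beta_3^{1/3}, \]

hence 

\[ \beta_3 + 3\beta_1\beta_2 + 2\beta_1^3 \le 6\beta_3 \]

and consequently 

\[ \left| \frac{d^3}{dz^3} \log f(z) \right| < 7\beta_3. \]
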


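§ 41. AN AUXILIARY THEOREM 


In § 38 we obtained for the characteristic function of the sum 

\[ \zeta_n = \frac{\xi_1 + \xi_2 + \cdots + \xi_n}{\sigma\sqrt{n}} \tag{1} \]

of independent identically distributed summands the formal expansion 

\[ f_n(t) = e^{-t^2/2} \left[ 1 + \sum_{k=1}^{\infty} P_k(it) \left( \frac{1}{\sqrt{n}} \right)^k \right]. \]

In this section we deduce several properties of this expansion. 


THEOREM 1. If in the sum (1) the summands have finite moments up to 
the sth order inclusive (\(s \ge 3\)), then for \(|t| \le T_{sn}\),* where 

\[ T_{sn} = \frac{\sqrt{n}}{8 s \rho_s^{3/s}}, \]

the inequality 

\[ \text{(a)} \quad \left| f_n(t) - e^{-t^2/2} \left[ 1 + \sum_{k=1}^{s-3} P_k(it) \left( \frac{1}{\sqrt{n}} \right)^k \right] \right| \le \frac{c_1(s)}{T_{sn}^{s-2}} \left( |t|^s + |t|^{3(s-2)} \right) e^{-t^2/4} \]

holds, where \(c_1(s)\) depends only on s; also, the inequality 

\[ \text{(b)} \quad \left| f_n(t) - e^{-t^2/2} \left[ 1 + \sum_{k=1}^{s-2} P_k(it) \left( \frac{1}{\sqrt{n}} \right)^k \right] \right| \le \frac{\delta(n)}{T_{sn}^{s-2}} \left( |t|^s + |t|^{3(s-1)} \right) e^{-t^2/4} \]

holds, where \(\delta(n)\) depends only on n and \(\lim \delta(n) = 0\). 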


Proof of Theorem 1(a). 
By the hypothesis of the theorem we can write 


uf): =D GO ede GL) 


B the inequality 


(2) 


(3) 


(4) 


* Translator's note. The subscripts s and n are separate indices; \(T_{sn}\) is not to be 
confused with the \(T_n\) in § 40. 

where \(|\theta| \le 1\). But for \(|t| \le T_{sn}\), according to (5) of § 15, 
ER a: ке 
HODHEN nca) zi 
B. © B. 8E] ^8 


Thus from the definition of o; and 8; and also (5) of $ 15 it follows that 


V 1 AAU : 1 1 үк 1 
14 < Ўт в | < ux) < ү00 
kai n k==2 
Consequently, 
10) ‚ 29. (5 
У (— 1) аа 


log f (g-) = log (14- U) = 


1«j€-7 
(9) 


(8| € )) 


Considering (4) as a formal expansion of U in a power series of t, we 


easily see that this series is majorized by the series 
1 
т 
vl (ay | 
№] В 
к= 2 а 


and the expansion of U? in powers of t is majorized by 








234 
k — 2j 
Therefore the series 
1 
ae а) 
RE Bn 


majorizes the sum of those terms in the expansion of U which contain 
powers of t of order \(\ge s\). From this we obtain by simple calculations that 
for \(|t| \le T_{sn}\) 
a—1 


log (z,) = 
i {у= (уз) t Sir s 


From this we find that , 
ES 
m ыш e Me de x 9 S" n а, 
log fn (f) = 2 00 к! (у) + apa s! ( ) . 


Ae (ge) aa о, 570 (=) 
y (195] 0. 
Put 


12 


8—3 
"Y VV e (OFT? о 2 ув 
V =log{e? (fn (tz)) \ ld (RF?) (y 3) 


k 
T9, Sh (+) - 


We expand e" in a power series of z, regarding z as a real variable |z| < 1, 
and ¢ and n as fixed. Then we obtain [сї. (25) of § 38] 


е0, 02) =1-- y Py Gb (F 2 +R), 
k=1 
where R(z) = O(z°-?) as z approaches 0. 
The series for V is ae by the series 


P3 3 121 v (е 2*8 k42; z ık 
Рз ру арс T t pue 
a ll va eran el ys] 
L 
со ( 8 tet? 
У хаа ES 
er k! n! 
We note that for 1 < k € s — 2, 
EDE 20 421 g_ st 
(4-2) S? RZ S| RI) 
and m 
Pk o <р, . 
Hence the series 1 k 
eo 8 
oF ep Vt E iat) (6) 
үл 4 R! Vn 


a fortiori majorizes the series for V. From this it is clear that the series for 
\(V^j\) is majorized by the series 





22 k 
"rc e jsp? |tz| 
wor ey sep Sz (rn). G 
Li yn 
Now we estimate the remainder term R(z). First of all, we have 


3—3 


vk 1 
eY = Vath ys—2elV1 (I1 уг ) 


k=0 
Furthermore, 
— 
=1 + У Pel) (Fa 
k= 6 ^ y= і, Yn 


where (г) contains only powers of z beginning with the (s — 2)nd. Using 
(6) and (7), we find that 


y +a (2), 


t k 
4 CES - jsp? || 
[о (2) |< Узви ӯ 7 X ier 
=s—2 -j 


Since |z| <1, we have for |t] < Tsn, 














1 1 
jsp? Mz] — Je? 0j s 
Jo «lxi (8) 
VR Т X BOB 
8p? 
and 
x 2 (y 
Ee Л (ss | fz | 2 = = = 8 
ГІ ^y^ y “Vn (s+ k—2-— jy 
[Т уа ВР Ё\ п п ke 
1 
SPs mmt 8 
= үл 
Therefore, 





8 sto? |tz 1 hoe ү 
le(z)| < <-% e( Ys -) Y c; |t. 


ј=1 
But for any geometrical progression with ratio а > 0, 


s—3 


2 aí < (s — 3) (a 4-а °), 
j=l 











so that a 
3(s—3) $ ,. [Spy IANT : 2 (s —: 
le) EE oF sean (28771) "(eg MD eg Lye?) 
3 
3 (s P 3) sp,? 8—2 | 4 |8 t 3(a — 2) 
< ki ston (Se sti | 
>` g =] | 
8 


one) 
Now we estimate \(V^{s-2} e^{|V|}\). According to (7) and (8) 
1 8 
-2 YI -9, —T,.43(-2)7 l yer? —41yl 
С =) кт 


But, as follows from (6) and (8), 
3 
9 ї 
Pp || = 
| V| < 302 ——— е8 
yn 


From the inequalities obtained we deduce that 





3(e-2) 2 
| V]** 2 el! «eo ET л 
Consequently, 
t 
|К (2) < A [14s [А8 6-72 |t (6-2) eT], 
Since 
Br 8—3 а 
№ Ge? —[1 + Уро (25) 7° |2 18 ente-$ 
y=l 


we obtain the assertion of the theorem by taking z = 1. 

Theorem 1(b) is proved similarly, except that in the proof it is necessary 
to use, instead of the expansion (4), the following expansion in the 
neighborhood of t = 0: 


We shall not enter into the details. 


§ 42. IMPROVEMENT OF LYAPUNOV’S THEOREM FOR 
NONLATTICE DISTRIBUTIONS 


The basic object of this section is to prove Theorem 2; for this purpose 
we need the following auxiliary proposition (Esseen [26]). 


THEOREM 1. If the distribution function F(x) is nonlattice, then whatever the 
number \(\omega > 0\) may be, there exists a function \(\lambda(n)\) such that \(\lim_{n \to \infty} \lambda(n) = \infty\) 
and 

\[ \int_{\omega}^{\lambda(n)} \frac{|f(t)|^n}{t} \, dt = o\left( \frac{1}{\sqrt{n}} \right). \tag{1} \]

Proof. If the function F(x) is such that * 

\[ \limsup_{|t| \to \infty} |f(t)| < 1, \]

then the theorem to be proved becomes trivial. 

In fact, from Theorem 5 of § 14 and the condition (C) we deduce that 
to every \(\varepsilon > 0\) there corresponds a \(c(\varepsilon) > 0\) such that \(|f(t)| \le e^{-c(\varepsilon)} < 1\) 
for \(|t| \ge \varepsilon\). 

Hence, in particular, there exists c > 0 such that \(|f(t)| \le e^{-c}\) for \(t \ge \omega\). 

Putting \(\lambda(n) = n\), we find that 

\[ \int_{\omega}^{n} \frac{|f(t)|^n}{t} \, dt \le e^{-cn} \int_{\omega}^{n} \frac{dt}{t} = e^{-cn} \log \frac{n}{\omega}. \]

Now let 

\[ \limsup_{|t| \to \infty} |f(t)| = 1. \]

Since by hypothesis the distribution considered is nonlattice, the equation 
|f(t)| = 1 cannot hold for any \(t \ne 0\) (see Theorem 5 of § 14). Hence it is 
possible to define a function a(t) for \(t \ge \omega\) by the following equation: 

\[ 1 - \frac{1}{a(t)} = \max_{\omega \le \tau \le t} |f(\tau)|. \]

Obviously the function a(t) is continuous, nondecreasing, and by virtue 
of the condition \(\limsup |f(t)| = 1\) satisfies the relation 

\[ \lim_{t \to \infty} a(t) = \infty. \]

From the definition of the function a(t) it is evident that 

\[ \int_{\omega}^{\lambda(n)} \frac{|f(t)|^n}{t} \, dt \le \left( 1 - \frac{1}{a(\lambda(n))} \right)^n \int_{\omega}^{\lambda(n)} \frac{dt}{t} \]

for every \(\lambda(n) > \omega\). If \(a(n) \le \sqrt{n}\), we set \(\lambda(n) = n\). Then we obtain 

\[ \int_{\omega}^{n} \frac{|f(t)|^n}{t} \, dt \le \left( 1 - \frac{1}{\sqrt{n}} \right)^n \log \frac{n}{\omega} \le e^{-\sqrt{n}} \log \frac{n}{\omega}. \]

* H. Cramér calls this Condition (C); the results of § 45 can be obtained under the 
assumption that Condition (C) is satisfied. 
If, however, \(a(n) > \sqrt{n}\), then 
\(п) 


'<] TU ia zoe) е2. 


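Let t(a) be the inverse function of a(t). Evidently \(\lim t(a) = \infty\). We now 
set \(\lambda(n) = t(\sqrt{n})\). Then \(a(\lambda(n)) = a(t(\sqrt{n})) = \sqrt{n}\). Furthermore, since 
\(a(n) > \sqrt{n}\), \(t(\sqrt{n}) < n\) and consequently \(\lambda(n) < n\). Therefore, in this 
case also, 

\[ \int_{\omega}^{\lambda(n)} \frac{|f(t)|^n}{t} \, dt \le e^{-\sqrt{n}} \log \frac{n}{\omega}. \]

Q.E.D. 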
The theorem formulated below was proved by H. Cramér [21] under the 
assumption that Condition (C) is satisfied, and by G. Esseen [26] in the 
form presented here. 


THEOREM 2. If the independent random variables \(\xi_1, \xi_2, \ldots, \xi_n\) are 
identically distributed, nonlattice, and have finite third moments, then 

\[ F_n(x) - \Phi(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}} \frac{Q_1(x)}{\sqrt{n}} + o\left( \frac{1}{\sqrt{n}} \right) \tag{2} \]

uniformly in x. Here \(Q_1(x) = \frac{\lambda_3}{6}(1 - x^2) = \frac{\kappa_3}{6\sigma^3}(1 - x^2)\) [cf. (1) and (24) 
of § 38]. 
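Proof. Putting s = 3 in Theorem 1(b) of § 41, we find that 

\[ \left| f_n(t) - e^{-t^2/2} - \frac{P_1(it)}{\sqrt{n}} e^{-t^2/2} \right| \le \frac{\delta(n)}{T_{3n}} \left( |t|^3 + |t|^6 \right) e^{-t^2/4}. \tag{3} \]

The characteristic function * of the function 

\[ P_1(-\Phi) = \frac{Q_1(x)}{\sqrt{2\pi}} e^{-x^2/2} = \frac{\lambda_3}{6\sqrt{2\pi}} (1 - x^2) e^{-x^2/2} \]

is equal to (see § 38) 

\[ \frac{\lambda_3}{6} (it)^3 e^{-t^2/2} = P_1(it) e^{-t^2/2}. \]

* Translator's note. The characteristic function of a function F(x) of bounded 
variation, not necessarily a distribution function, is its Fourier-Stieltjes transform 
\(\int e^{itx} \, dF(x)\). 

Use Theorem 1 of § 39, and put there 

\[ F(x) = F_n(x), \qquad G(x) = \Phi(x) + \frac{1}{\sqrt{n}} P_1(-\Phi), \]

\[ A = \max |G'(x)| < +\infty, \qquad T = \lambda(n)\sqrt{n} \]

[\(\lambda(n)\) is defined as in Theorem 1 taking \(\omega = 1/(24\rho_3)\)]. 

Without loss of generality we may suppose that \(T > T_{3n}\). (In fact, this 
inequality is satisfied for all sufficiently large n.) 

We estimate the integral 

\[ \varepsilon = \int_{-T}^{T} \left| \frac{f_n(t) - g(t)}{t} \right| dt = \int_{-T}^{-T_{3n}} + \int_{-T_{3n}}^{T_{3n}} + \int_{T_{3n}}^{T}. \]

According to (3), 

\[ \int_{-T_{3n}}^{T_{3n}} \left| \frac{f_n(t) - g(t)}{t} \right| dt \le \frac{\delta(n)}{T_{3n}} \int \left( t^2 + |t|^5 \right) e^{-t^2/4} \, dt = o\left( \frac{1}{\sqrt{n}} \right). \]

Further, 

\[ \int_{T_{3n}}^{T} \left| \frac{f_n(t) - g(t)}{t} \right| dt \le \int_{T_{3n}}^{T} \frac{|f_n(t)|}{t} \, dt + \int_{T_{3n}}^{T} \frac{|g(t)|}{t} \, dt. \]

But 

\[ \int_{T_{3n}}^{T} \frac{|g(t)|}{t} \, dt = o\left( \frac{1}{\sqrt{n}} \right), \]

and by the preceding theorem 

\[ \int_{T_{3n}}^{T} \frac{|f_n(t)|}{t} \, dt = o\left( \frac{1}{\sqrt{n}} \right). \]

The integral \(\int_{-T}^{-T_{3n}}\) is estimated similarly. Thus \(\varepsilon = o(1/\sqrt{n})\). An application 
of Theorem 1 of § 39 leads to the inequality 

\[ \left| F_n(x) - \Phi(x) - \frac{1}{\sqrt{n}} P_1(-\Phi) \right| \le k \frac{\varepsilon}{2\pi} + c(k) \frac{A}{T} = o\left( \frac{1}{\sqrt{n}} \right), \]

which proves the theorem. 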
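The effect of the correction term in Theorem 2 can be seen on a nonlattice example with an explicit distribution of the sum. The sketch below is ours. It assumes that the expansion reads \(F_n(x) - \Phi(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}\frac{Q_1(x)}{\sqrt{n}} + o(1/\sqrt{n})\) with \(Q_1(x) = \frac{\lambda_3}{6}(1 - x^2)\). As summands it takes exponential variables with mean 1, centered, for which \(\sigma = 1\) and \(\lambda_3 = 2\); the sum of n such variables has the Erlang distribution, whose distribution function is a finite sum.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def erlang_cdf(n, y):
    """P(e_1 + ... + e_n <= y) for independent exponential e_i with mean 1."""
    if y <= 0.0:
        return 0.0
    term, partial = 1.0, 1.0                 # term = y**j / j!
    for j in range(1, n):
        term *= y / j
        partial += term
    return 1.0 - math.exp(-y) * partial


def edgeworth_errors(n, grid=400):
    """Max over a grid of |F_n - Phi| and of |F_n - Phi - pdf * Q_1 / sqrt(n)|."""
    lam3 = 2.0                               # third semi-invariant, sigma = 1
    root_n = math.sqrt(n)
    plain = corrected = 0.0
    for i in range(grid + 1):
        x = -4.0 + 8.0 * i / grid
        f_n = erlang_cdf(n, n + x * root_n)  # normalized sum is (S_n - n)/sqrt(n)
        diff = f_n - normal_cdf(x)
        q1 = lam3 / 6.0 * (1.0 - x * x)
        plain = max(plain, abs(diff))
        corrected = max(corrected, abs(diff - normal_pdf(x) * q1 / root_n))
    return plain, corrected


if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, edgeworth_errors(n))
```

In this example the corrected error is markedly smaller than the uncorrected one, as the theorem leads one to expect. The maxima are taken over a finite grid of x-values only, which is an approximation to the uniform statement.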
§ 43. DEVIATION FROM THE LIMIT LAW IN THE CASE 
OF A LATTICE DISTRIBUTION 


We shall preface the detailed exposition of the following results with a 
brief intuitive argument which should clarify the reasons why Theorem 2 
of § 42 does not hold for lattice distributions. Suppose that the random 
variables \(\xi_n\) can take only two values +1 and −1, each with probability 1/2. 
If n is an even number, then the functions \(F_n(x)\) are discontinuous, with 
jumps at the points \(x_\nu = \nu/\sqrt{n}\) (\(\nu = 0, \pm 2, \pm 4, \ldots, \pm n\)). According to the 
local theorem of de Moivre-Laplace, at each discontinuity point x the 
function \(F_n(x)\) has a jump asymptotically equal to \(\frac{2}{\sqrt{2\pi n}} e^{-x^2/2}\) (this 
assertion is a particular case of the theorem to be proved in § 49). 

Consider in more detail the relative behavior of the functions \(F_n(x)\) and 
\(\Phi(x)\) in the neighborhood of a discontinuity point, say the point x = 0. In 
the interval \((-1/\sqrt{n}, 1/\sqrt{n})\) the function \(\Phi(x)\) behaves like the function 

\[ \frac{1}{2} + \frac{x}{\sqrt{2\pi}} \]

up to infinitesimals of higher order. Introduce the function 

\[ S(x) = [x] - x + \tfrac{1}{2}. \]

It is easily seen that up to infinitesimals of higher order than \(1/\sqrt{n}\) 
the equation 

\[ F_n(x) - \Phi(x) = \frac{2}{\sqrt{2\pi n}} S\left( \frac{x\sqrt{n}}{2} \right) \]

holds in the interval \((-1/\sqrt{n}, 1/\sqrt{n})\). 

If we wish to write down analogous asymptotic equations for other 
values of x, then we must take into account the change of the slope of the 
curve \(y = \Phi(x)\). Thus we are led to the consideration of the difference 

\[ F_n(x) - \Phi(x) - D_n(x), \]

where 

\[ D_n(x) = \frac{2}{\sqrt{2\pi n}} S\left( \frac{x\sqrt{n}}{2} \right) e^{-x^2/2}. \tag{1} \]


Y 2zn 2 








We now consider the general case of a lattice distribution. Let the 
possible values of the random variable \(\xi_n\) be \(x_\nu = a + \nu h\) (\(\nu = 0, \pm 1, \pm 2, \ldots\)) 
and the span h of the distribution be maximum. We put all lattice 
distributions with the maximum span h into one class and call it the class \(L_h\). If 

\[ P(\xi_n = a + \nu h) = p_\nu, \]

then by the tacit assumption \(M\xi_n = 0\) 

\[ \mu = \sum_{\nu} \nu p_\nu = -\frac{a}{h}. \]

The possible values of the sums \(\zeta_n\) will have the form 

\[ \frac{h}{\sigma\sqrt{n}} (\nu - n\mu) \qquad (\nu = 0, \pm 1, \pm 2, \ldots). \tag{2} \]

Put 

\[ a_n = \frac{h n \mu}{\sigma\sqrt{n}} \]

and consider the function 

\[ S_1(x) = \frac{h}{\sigma} S\left( \frac{(x + a_n)\sigma\sqrt{n}}{h} \right). \tag{3} \]

The object of this section is to prove the following theorem (Esseen [26]). 


THEOREM 1. If \(\xi_1, \xi_2, \ldots, \xi_n\) are independent, identically distributed 
random variables having finite third moments and belonging to the class \(L_h\), 
then 

\[ F_n(x) - \Phi(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left( \frac{Q_1(x)}{\sqrt{n}} + \frac{S_1(x)}{\sqrt{n}} \right) + o\left( \frac{1}{\sqrt{n}} \right) \]

uniformly in x. 
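Proof. First of all we calculate the characteristic function of 

\[ D_n(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi n}} S_1(x): \]

\[ d_n(t) = \int e^{itx} \, dD_n(x) = -it \int e^{itx} D_n(x) \, dx. \]

For this purpose we note that S(x) is periodic with period one and write 
the function \(S[((x + a_n)/h)\sigma\sqrt{n}]\) in the form of a Fourier series. By the usual 
methods it is easily found that 

\[ S(x) = \sum_{\nu=1}^{\infty} \frac{\sin 2\pi\nu x}{\pi\nu} \]

and that consequently 

\[ S\left( \frac{(x + a_n)\sigma\sqrt{n}}{h} \right) = \sum_{\nu=1}^{\infty} \frac{1}{\pi\nu} \sin\left( \tau\sigma\sqrt{n}\,\nu (x + a_n) \right), \]

where 

\[ \tau = \frac{2\pi}{h}. \]

Therefore 

\[ \begin{aligned} d_n(t) &= -\frac{2it}{\tau\sigma\sqrt{n}} \frac{1}{\sqrt{2\pi}} \sum_{\nu=1}^{\infty} \frac{1}{\nu} \int e^{itx - x^2/2} \sin\left( \tau\sigma\sqrt{n}\,\nu (x + a_n) \right) dx \\ &= -\frac{t}{\tau\sigma\sqrt{n}} \frac{1}{\sqrt{2\pi}} \sum_{\nu}{}' \, \frac{1}{\nu} \int e^{itx - x^2/2 + i\tau\sigma\sqrt{n}\,\nu (x + a_n)} \, dx, \end{aligned} \]

where the summation is extended to all integral values \(\nu \ne 0\). An easy 
calculation leads to the equation 

\[ d_n(t) = -\frac{t}{\tau\sigma\sqrt{n}} \sum_{\nu=-\infty}^{\infty}{}' \, \frac{1}{\nu} \, e^{i\tau\sigma\sqrt{n}\, a_n \nu - \frac{1}{2}(t + \tau\sigma\sqrt{n}\,\nu)^2}. \]
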


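We now apply Theorem 2 of § 39, putting there 

\[ F(x) = F_n(x), \qquad G(x) = \Phi(x) + \frac{e^{-x^2/2}}{\sqrt{2\pi n}} \left[ Q_1(x) + S_1(x) \right], \]

\[ l = \frac{h}{\sigma\sqrt{n}}, \qquad A = \max |G'(x)| < +\infty \quad (x \ne x_\nu; \ \nu = 0, \pm 1, \pm 2, \ldots), \]

\[ T = n > T_{3n} = \frac{\sqrt{n}}{24\rho_3}. \]

Then, whatever \(c_2(k)\) may be, for sufficiently large n 

\[ T \cdot l = \frac{h\sqrt{n}}{\sigma} \ge c_2(k). \]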


We now estimate the integral 


el 








inf T g(t) | dt. 


For this purpose we split it into three parts: 





-4o Vn -°У" T 
= | ‚ъ= f, a= | 50—00 |а. 
= —-уеуп А 


We may suppose that Тз, < ire V/n (otherwise the estimates will only 
be simplified). The period of the function |f(é)| is т; consequently, according 
to Remark 2 after Theorem 3 of $ 14, it is possible to find a number а > 0 
(not dependent on n of the condition Тз, ‘суп = 1 '24рзс), such that 
JOJ < e7^ for Т/с т € [t| € 7/2. Then in the interval 


To, & [t] & 2 Ип, 
vov 
It is also obvious that in this interval for some \(c_1 > 0\) 
leI”. 
Therefore 
1 - yx 
Tin F TO п 


ea 


3n Ton 


"ш. 








By the definition of g(t) and Theorem 1(b) of § 41 


Тэ Ty 
| [507-0 at о (Fa) + | 


зп — Ту 





0 at. 
t 





But for 4 € Тз, and sufficiently large n, 














dy Bo |< 





Therefore, finally, 





We now turn to the estimation of e: 





fn (t) — g (0) Ín (t) — da (0) 
£&(0-—£0|, o (1 ij f | 0. la 


T ^" 
sin =n 
Vn 


=0(—)+ | | dale! 3a C lat 





2k+1 Vn 


=0(} )+Ў E ( | 2®—%е@уп а |, 


k=1 2k—1. 2+1. 





where 


Put 
and make the change of variables ( = z + kr. Here we recall that 


/Ф= Ў е7", 
and consequently 
flie+ kx)” = е inh f^ (z) = е7 їка д9 Vn f(z). 


Therefore 


| 
! 


UR —ika 3 Yn on 
"|е 4 f (z)+ 





uag’? 
z-BRkINO 1 iwa o Vn o0. (rene 
U a 


1, = i rm hx dz. 


т 
But 
na? . 3 
EA (ERATEN 
amd v d 
y» 





— ikas Ут na, nme c 
= — Е е Р + O (e a 2 
uniformly in z (iz! < 7,2); hence 
To! 1 1 ; 
+> = mcns = =z 9122. 
(ате Fe | — 
= | ee 0 e ). 


Now, on the one hand, by Theorem 2 of § 40 








Y -4 snz? 
(f (2))^ —e 
| 2 + Ет ae 


® gen 
t—eln 
470, 


е. [ ine шошо 











= syn kn 
= > Уп 
On the other hand, 
Y F 
1 | |! —— спі f 1 s 
dx o TIEN k? кшш i) 
ШЕ 


Therefore 
and since \(r = O(\sqrt{n})\), 





logn 
е; == о( п -) 
Therefore, finally, 
1 
E (ут) i 


An application of Theorem 2 of § 39 leads to the inequality 

\[ |F_n(x) - G(x)| \le o\left( \frac{1}{\sqrt{n}} \right) + c_1(k) \frac{A}{T} = o\left( \frac{1}{\sqrt{n}} \right), \]

which proves our theorem. 


§ 44. THE EXTREMAL CHARACTER OF THE BERNOULLI CASE 


The results of the last two sections enable us to obtain some extremal 
properties of lattice distributions and to clarify the special role of the 
Bernoulli distribution (Esseen [26]). 


THEOREM 1. If the random variables \(\xi_1, \xi_2, \ldots, \xi_n, \ldots\) are independent, 
identically distributed, and have finite third moments, then 

\[ \lim_{n \to \infty} \max_{-\infty < x < +\infty} \sqrt{n} \left| F_n(x) - \Phi(x) - \frac{e^{-x^2/2}}{\sqrt{2\pi}} \frac{Q_1(x)}{\sqrt{n}} \right| = \omega(h), \]

where \(\omega(h)\) is equal to 0 if F(x) is a nonlattice distribution, and equal to 

\[ \frac{h}{2\sigma\sqrt{2\pi}} \]

if F(x) is a lattice distribution with the maximum span h. 
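Proof. Indeed, for nonlattice distributions, as it follows from Theorem 2 
of § 42, 

\[ R_n(x) = \sqrt{n} \left| F_n(x) - \Phi(x) - \frac{e^{-x^2/2}}{\sqrt{2\pi}} \frac{Q_1(x)}{\sqrt{n}} \right| = o(1). \]

From the theorem in § 43 we find that for lattice distributions under the 
conditions of the present theorem 

\[ R_n = \frac{h}{\sigma\sqrt{2\pi}} \left| S\left( \frac{(x + a_n)\sigma\sqrt{n}}{h} \right) \right| e^{-x^2/2} + o(1). \tag{1} \]

From the definition of the function S(x) and its periodicity with period 
\(h/\sigma\sqrt{n}\), we deduce that 

\[ \max_{-\infty < x < \infty} R_n = \frac{h}{2\sigma\sqrt{2\pi}} e^{-\delta_n^2/2} + o(1), \qquad 0 \le \delta_n \le \frac{h}{\sigma\sqrt{n}}. \]

Since \(\delta_n \to 0\) as \(n \to \infty\), 

\[ \lim_{n \to \infty} \max_{-\infty < x < \infty} R_n = \frac{h}{2\sigma\sqrt{2\pi}}, \]

proving the theorem. 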
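The limit \(h/(2\sigma\sqrt{2\pi})\) can be observed numerically for an asymmetric lattice law. The sketch below is ours and assumes the statement of Theorem 1 in the form \(\lim \max_x \sqrt{n}\,|F_n(x) - \Phi(x) - \frac{e^{-x^2/2}}{\sqrt{2\pi}}\frac{Q_1(x)}{\sqrt{n}}| = h/(2\sigma\sqrt{2\pi})\) with \(Q_1(x) = \frac{\lambda_3}{6}(1 - x^2)\). The summands are centered Bernoulli variables with success probability p, for which h = 1, \(\sigma = \sqrt{pq}\) and \(\lambda_3 = (q - p)/\sqrt{pq}\). The maximum is taken over the left and right limits at the jump points only, which is an approximation to the maximum over all x.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def scaled_max_remainder(n, p):
    """sqrt(n) * max over jump points of |F_n - Phi - pdf * Q_1 / sqrt(n)|."""
    q = 1.0 - p
    sigma = math.sqrt(p * q)
    lam3 = (q - p) / sigma                   # third semi-invariant of xi / sigma
    root_n = math.sqrt(n)
    log_p, log_q = math.log(p), math.log(q)
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):                   # k = number of successes
        # binomial probability in log space to avoid underflow
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        x = (k - n * p) / (sigma * root_n)   # atom of the normalized sum
        g = normal_cdf(x) + normal_pdf(x) * lam3 / 6.0 * (1.0 - x * x) / root_n
        left = cdf                           # F_n(x - 0)
        cdf += math.exp(log_pmf)
        worst = max(worst, abs(left - g), abs(cdf - g))
    return root_n * worst


if __name__ == "__main__":
    p = 0.3
    limit = 1.0 / (2.0 * math.sqrt(p * (1.0 - p)) * math.sqrt(2.0 * math.pi))
    for n in (100, 500, 2000):
        print(n, scaled_max_remainder(n, p), limit)
```

The subtraction of the \(Q_1\) term matters here: without it the asymmetry of the summands would contribute a further term of the same order \(1/\sqrt{n}\).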

Thus, the convergence of the distribution functions of normalized sums 
to a limit law is in a certain sense worst for lattice distributions. The 
following theorem in a way supplements Theorem 1 in the case of symmetrical 
distributions. 


THEOREM 2. If the random variables \(\xi_1, \xi_2, \ldots, \xi_n\) are symmetrically 
distributed and satisfy the conditions of the preceding theorem, and if their 
distribution function is continuous at the point x = 0, then 

\[ \lim_{n \to \infty} \max_{-\infty < x < +\infty} \sqrt{n} \, |F_n(x) - \Phi(x)| \le \frac{1}{\sqrt{2\pi}}. \]

The last inequality becomes an equality if and only if 

\[ F(x) = \begin{cases} 0 & \text{for } x < -\tfrac{h}{2}, \\ \tfrac{1}{2} & \text{for } |x| < \tfrac{h}{2}, \\ 1 & \text{for } x \ge \tfrac{h}{2} \end{cases} \]

(Bernoulli scheme). 
Proof. As the preceding theorem shows, we may confine ourselves to
the consideration of lattice distributions. Therefore, we have to find
the maximum value of the quantity

$$\frac{h}{2\sigma\sqrt{2\pi}}.$$

Two cases may occur: 1) the variable $\xi$ takes values of the form $\nu h$
($\nu = \pm 1, \pm 2, \ldots$) with probabilities $p_\nu$; 2) $\xi$ takes values of the
form $(\nu - \tfrac12)h$ ($\nu = 0, \pm 1, \pm 2, \ldots$) with probabilities $p_\nu$; in either
case $\sum p_\nu = 1$. In the first case,

$$\sigma^2 = 2\sum_{\nu=1}^{\infty} \nu^2 h^2 p_\nu \ge 2h^2 \sum_{\nu=1}^{\infty} p_\nu = h^2.$$

In the second case,

$$\sigma^2 = 2\sum_{\nu=1}^{\infty} \left(\nu-\tfrac12\right)^2 h^2 p_\nu \ge \frac{h^2}{2}\sum_{\nu=1}^{\infty} p_\nu = \frac{h^2}{4}.$$

In the first case, equality is unattainable:

$$\frac{h}{\sigma} < 1.$$

In the second case,

$$\frac{h}{\sigma} \le 2;$$

moreover equality is attained if and only if all $p_\nu = 0$ for $\nu \ne 0$ or 1.
Therefore we have always

$$\frac{h}{2\sigma\sqrt{2\pi}} \le \frac{1}{\sqrt{2\pi}},$$

and equality is attained only for the symmetrical scheme of Bernoulli.
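
The extremal role of the symmetrical Bernoulli scheme can be checked
numerically. The following Python sketch is an illustration only and is not
part of the original argument; the function names are hypothetical. It takes
$\xi = \pm\tfrac12$ with probability $\tfrac12$ each (so that h = 1 and
$\sigma = \tfrac12$), builds the exact distribution of the normalized sum from
binomial coefficients, and evaluates $\sqrt{n}\max_x |F_n(x)-\Phi(x)|$, which
should be close to $1/\sqrt{2\pi} \approx 0.3989$ for large n.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bernoulli_scaled_sup(n):
    """sqrt(n) * sup_x |F_n(x) - Phi(x)| for summands +-1/2 with probability 1/2.

    The sum equals k - n/2 with k binomial(n, 1/2); sigma = 1/2 and h = 1.
    The supremum of a step function minus a continuous monotone function
    is attained at a jump, so both one-sided limits are examined there.
    """
    sigma = 0.5
    cumulative = 0.0
    worst = 0.0
    for k in range(n + 1):
        x = (k - n / 2.0) / (sigma * math.sqrt(n))
        target = normal_cdf(x)
        worst = max(worst, abs(cumulative - target))   # limit from the left
        cumulative += math.comb(n, k) / 2 ** n
        worst = max(worst, abs(cumulative - target))   # limit from the right
    return math.sqrt(n) * worst


if __name__ == "__main__":
    for n in (100, 400):
        print(n, bernoulli_scaled_sup(n), 1.0 / math.sqrt(2.0 * math.pi))
```

For even n the largest deviation occurs at the jump at x = 0, whose half-size
is asymptotically $1/\sqrt{2\pi n}$.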

Both conditions of Theorem 2 (symmetry and the continuity of F(x)
at x = 0) are essential; the rejection of either one of these conditions would
cause the expression

$$\lim_{n\to\infty}\ \max_{-\infty<x<+\infty} \sqrt{n}\,|F_n(x)-\Phi(x)|$$

to become unbounded.

Indeed, suppose that $\xi$ can take the value 0 with positive probability.
The example of the symmetrical variables taking only the values −1, +1, 0
with the corresponding probabilities $p^2/2$, $p^2/2$, $1 - p^2$ shows that the
expression

$$\frac{h}{2\sigma\sqrt{2\pi}} = \frac{1}{2p\sqrt{2\pi}} \tag{2}$$

becomes as large as we wish for sufficiently small p.
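
The effect of an atom at zero can also be seen numerically. The sketch below
is only an illustration under stated assumptions: the helper names are
hypothetical, the distribution of the sum is obtained by repeated convolution,
and the choice p = 0.3, n = 400 is arbitrary. For the three-point law above
with h = 1 and $\sigma = p$, the scaled distance should be near
$1/(2p\sqrt{2\pi}) \approx 0.665$, well above the Bernoulli value
$1/\sqrt{2\pi}$.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pmf_of_sum(step, n):
    """Distribution of the sum of n independent copies of an integer-valued law.

    `step` maps each integer value to its probability.
    """
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for s, ps in dist.items():
            for v, pv in step.items():
                new[s + v] = new.get(s + v, 0.0) + ps * pv
        dist = new
    return dist


def scaled_sup_distance(step, n):
    """sqrt(n) * sup_x |F_n(x) - Phi(x)| for a mean-zero integer-valued law."""
    sigma = math.sqrt(sum(pv * v * v for v, pv in step.items()))
    dist = pmf_of_sum(step, n)
    cumulative = 0.0
    worst = 0.0
    for s in sorted(dist):
        target = normal_cdf(s / (sigma * math.sqrt(n)))
        worst = max(worst, abs(cumulative - target))   # limit from the left
        cumulative += dist[s]
        worst = max(worst, abs(cumulative - target))   # limit from the right
    return math.sqrt(n) * worst


if __name__ == "__main__":
    p = 0.3
    step = {-1: p * p / 2, 0: 1 - p * p, 1: p * p / 2}
    print(scaled_sup_distance(step, 400), 1.0 / (2.0 * p * math.sqrt(2.0 * math.pi)))
```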

Similarly, if we discard the assumption of symmetry of the law F(x), 
we can also easily see by a simple example that (2) becomes unbounded 
for laws of the class La. Indeed, let & take only three values a — h, a, 
a + kh with the corresponding probabilities $p_1$, $p_0$, $p_2$. We choose $a$, $p_1$, $p_2$,
and $k$ so that

$$M\xi = a + h(-p_1 + kp_2) = 0,$$
$$M\xi^3 = a^3 + 3a\sigma^2 + h^3(-p_1 + k^3p_2) = 0.$$

Here $p_1$ and $p_2$ can still vary quite freely. Choose $p_1$ and $p_2$ so small that
$k^2p_2$ is sufficiently small; then the expression

$$\frac{h}{2\sigma\sqrt{2\pi}} = \frac{1}{2\sqrt{2\pi}\,\sqrt{p_1 + k^2p_2 - (kp_2 - p_1)^2}}$$

becomes as large as we wish.
§ 45. IMPROVEMENT OF LYAPUNOV'S THEOREM
WITH HIGHER MOMENTS FOR THE CONTINUOUS CASE


Now we shall formulate precisely and prove the theorem, discussed in
§ 38, concerning the expansion of the function $F_n(x)$ in a series of
polynomials. In this connection it is necessary to impose on the summands $\xi_k$
stronger conditions than those imposed in Theorem 2 of § 42.


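THEOREM. If the independent random variables $\xi_1, \xi_2, \ldots, \xi_n$ are identically
distributed and have finite absolute moments $\beta_s$ of the sth order ($s \ge 3$)
and if Condition (C),

$$\limsup_{|t|\to\infty} |f(t)| < 1,$$

is satisfied, then

$$F_n(x) - \Phi(x) = \sum_{k=1}^{s-2} \frac{P_k(-\Phi)}{n^{k/2}} + o\!\left(\frac{1}{n^{(s-2)/2}}\right) \tag{1}$$

uniformly in x.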


Proof. The proof of this proposition is based on the application of 
Theorems 1 of § 39 and 1(b) of § 42. We put in Theorem 1 of § 39 


F (x) = F, (x), 





о е KM 
n? n? 
= Ф(х) + Y m 
A= тах |G'()|«-Fes Тп", d 
—о< 02 < o 


The characteristic function of the function G(x), as is easily deduced
from the definition of $P_k(-\Phi)$ and (13) of § 38, is equal to

$$e^{-t^2/2}\left[1 + \sum_{k=1}^{s-2} \frac{P_k(it)}{n^{k/2}}\right]. \tag{2}$$
Without loss of generality, we may suppose that $T > T_{sn}$, where $T_{sn}$ is
understood to be a quantity introduced in the formulation of Theorem 1
of § 41. We estimate the integral


T 
з= f ше dt, 
—T 
where, according to (2),

$$g_n(t) = e^{-t^2/2}\left[1 + \sum_{k=1}^{s-2} \frac{P_k(it)}{n^{k/2}}\right].$$


By Theorem 1(Ъ) of § 41 we easily find that 


T on 
| кш 1 | 
- ; 


8—2 
8n 








n 


Furthermore, from Condition (C) it follows by Theorem 5 of § 14 that 
the distribution of the variable & is not a lattice distribution; hence there 


А 1 
exists a c > 0 such that for | > =, 
88a p? 


lf (t)| < е-е. 
Then for |t| > Ten 


fn (I= Un (s) em. 
But 


f je mm [ш < f nO qt f 2014 


Tan ete T Tan CT Tey 1167 


1 
< де-еп log + 0 p =o 5 |· 
m "EM "S 


'Thus 


Now according to Theorem 1 of $ 39, 





F,() —e9)— Y 5-9) 


k 
k=1 


п? 


керө i=: (3) 





which is what was to be proved.

In the preceding section we have seen that even in the case s = 3 a
more complicated expansion of the function $F_n(x)$ holds for lattice
distributions. In case s > 3 the last theorem cannot be extended in a considerable
number of other cases. In case Condition (C) is not satisfied, i.e., if

$$\limsup_{|t|\to\infty} |f(t)| = 1, \tag{4}$$
the order of the remainder term in the expansion of $F_n(x)$ turns out to
depend on the arithmetical nature of the set of possible values of the
random variable $\xi_k$. We remark in this connection that (4) can be satisfied
for a nonlattice distribution only if all the variation of F(x) is
concentrated in a set of measure zero (Cramér [21], Theorem 7). For example,
if $\xi_k$ takes only the values $\pm 1$ and $\pm\sqrt{3}$, each with probability $\tfrac14$, then its
distribution is not a lattice distribution. Its characteristic function

$$f(t) = \tfrac12\left(\cos t + \cos t\sqrt{3}\right)$$

(as an almost periodic function) satisfies equation (4). Simple calculations *
show that for even n the function $F_n(x)$ has a jump at x = 0, asymptotically
equal to $2/(\pi n)$. This obviously means that even though all the moments
of F(x) are finite in our example, it is impossible to write the expansion
a? 


2 
Р, (0) Ф (9 [| 999 309. | (7). 


7 æ 


n 


We thus see that in the case of discrete distributions it is necessary to 
supplement the expansion (1) with discontinuous terms. 
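
The size of the jump in the example with the values $\pm 1$, $\pm\sqrt{3}$ can be
checked by exact arithmetic. The following Python sketch is an illustration
only; the function names are hypothetical. It uses the fact that a sum
$a + b\sqrt{3}$ with integers a, b vanishes only when a = b = 0, computes
$P(\xi_1 + \cdots + \xi_n = 0)$ from the binomial sum given in the footnote,
compares it with a brute-force enumeration for small n, and with the
asymptotic value $2/(\pi n)$.

```python
import math
from fractions import Fraction
from itertools import product


def jump_at_zero(n):
    """Exact P(sum = 0) for n summands taking +-1, +-sqrt(3) with probability 1/4."""
    if n % 2:
        raise ValueError("the sum can vanish only for even n")
    r = n // 2
    total = sum(
        math.comb(n, 2 * k) * math.comb(2 * k, k) * math.comb(n - 2 * k, r - k)
        for k in range(r + 1)
    )
    return Fraction(total, 4 ** n)


def jump_by_enumeration(n):
    """Brute-force check: each step is (a, b), standing for a + b*sqrt(3)."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    hits = 0
    for path in product(steps, repeat=n):
        if sum(s[0] for s in path) == 0 and sum(s[1] for s in path) == 0:
            hits += 1
    return Fraction(hits, 4 ** n)


if __name__ == "__main__":
    print(jump_at_zero(4), jump_by_enumeration(4))
    n = 1000
    print(float(jump_at_zero(n)), 2.0 / (math.pi * n))
```

The closed form $4^{-n}\,(C_n^{n/2})^2$ follows from the identity
$C_{2r}^{2k} C_{2k}^{k} C_{2r-2k}^{r-k} = C_{2r}^{r}\,(C_r^k)^2$.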


§ 46. LIMIT THEOREM FOR DENSITIES


If the random variables to be summed are continuous (i.e., if they have 
probability densities), it is natural to seek the conditions under which 
not only the distribution functions of the sums converge to a limit, but also 
the densities of the probability distributions of the sums converge to the 
density of the limit distribution. We see that the second requirement is not 
a consequence of the first by considering the following example. 


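* Here is a sketch of these calculations. It is evident that for n = 2r

$$f^{2r}(t) = 4^{-2r} \sum_{s=0}^{2r} C_{2r}^{s}\,(e^{it} + e^{-it})^{s}\,(e^{it\sqrt{3}} + e^{-it\sqrt{3}})^{2r-s}.$$

The magnitude of the jump of $F_n(x)$ at x = 0 is equal to the coefficient of $e^{0}$. The
summands containing $e^{0}$ are obtained only by multiplying together the middle
terms of the expansions of $(e^{it} + e^{-it})^{s}$ and $(e^{it\sqrt{3}} + e^{-it\sqrt{3}})^{2r-s}$ for even s.
Therefore, the required jump is equal to

$$p_0 = 4^{-2r} \sum_{k=0}^{r} C_{2r}^{2k}\, C_{2k}^{k}\, C_{2r-2k}^{r-k} = 4^{-2r}\, C_{2r}^{r} \sum_{k=0}^{r} (C_r^k)^2 \sim \frac{2}{\pi n}.$$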
Let the distribution function F(x) be defined by means of the equation

$$F(x) = \int_{-\infty}^{x} p(z)\,dz,$$

where

$$p(x) = \begin{cases} 0 & \text{for } |x| \ge \dfrac{1}{e},\\[6pt] \dfrac{1}{2|x|\log^2|x|} & \text{for } |x| < \dfrac{1}{e}. \end{cases}$$

Since the random variable $\xi$ with such a distribution is bounded, the
function F(x) belongs to the domain of normal attraction of the normal
law. In other words, if $\xi_1, \xi_2, \ldots, \xi_n$ are independent random variables
having F(x) as their common distribution, and

$$\sigma^2 = D\xi_1 = M\xi_1^2 = \int_0^{1/e} \frac{x}{\log^2 x}\,dx,$$

then as $n \to \infty$

$$P\left\{\frac{\xi_1 + \xi_2 + \cdots + \xi_n}{\sigma\sqrt{n}} < x\right\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-z^2/2}\,dz.$$

We shall show that the probability densities of the sums

$$\frac{1}{\sigma\sqrt{n}}(\xi_1 + \xi_2 + \cdots + \xi_n)$$

do not converge to the density of the normal distribution as $n \to \infty$.
Indeed, the probability density of the sum $\xi_1 + \xi_2$ is

$$p_2(x) = \int_{-1/e}^{1/e} p(z)\,p(x - z)\,dz.$$

We shall consider only those values of the argument x which are near the
point x = 0 (in particular we shall suppose that $|x| < 1/e$), and for the
sake of definiteness we confine ourselves to considering only positive values
of x. Under these conditions,

$$p_2(x) > \int_{-x}^{x} p(z)\,p(x - z)\,dz.$$

Since the minimum of the function p(x − z) in the interval 0 < |z| < x
is attained at the point z = 0,

$$p_2(x) > \frac{1}{2x\log^2 x} \int_{-x}^{x} \frac{dz}{2|z|\log^2|z|} = \frac{1}{2x\,|\log^3 x|}.$$

In exactly the same way, it is easily seen that the probability density
$p_3(x)$ of the sum $\xi_1 + \xi_2 + \xi_3$ satisfies the inequality

$$p_3(x) > \frac{c_3}{|x|\,\log^4|x|},$$

where $c_3 > 0$ is a constant, in a neighborhood of the point x = 0.

In general, the probability density $p_n(x)$ of the sum $\xi_1 + \xi_2 + \cdots + \xi_n$
satisfies the relation

$$p_n(x) > \frac{c_n}{|x|\,\bigl|\log^{n+1}|x|\bigr|} \quad (c_n > 0)$$

in a neighborhood of the point x = 0.

Thus for every n the function $p_n(x)$ is infinite for x = 0. This means that
$p_n(x)$ cannot converge to the density of the normal distribution under any
normalization.

The example above compels us to search for sufficiently general condi- 
tions under which the probability densities of the sums converge to the 
density of the limit distribution. 


THEOREM 1.* Let the random variables of the sequence

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be independent, identically distributed, and have the probability density
p(x).† If

1) for a certain $m \ge 1$ the probability density $p_m(x)$ of the sum
$\xi_1 + \xi_2 + \cdots + \xi_m$ is integrable in the rth power ($1 < r \le 2$) [as we say,
belongs to the space $L^r$], and

2) $\int x^2 p(x)\,dx < +\infty$,


* Translator's note. Theorems 1, 2 and the Theorem in § 47 are simplified versions
given in the Hungarian translation by I. Foldes (Budapest, 1951) of the Russian
book.

† It is sufficient to assume that the density $p_m(x)$ of the sum $\xi_1 + \xi_2 + \cdots + \xi_m$
exists for some $m \ge 1$.
then the relation

$$\sigma\sqrt{n}\; p_n(\sigma\sqrt{n}\,x) - \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \to 0 \quad (n \to \infty)$$

holds uniformly with respect to x in the interval ($-\infty < x < \infty$), where

$$\sigma^2 = \int x^2 p(x)\,dx.$$


Before turning to the proof, we shall make two simple remarks, which 
give us some idea about the class of distributions for which the conditions 
of the theorem are satisfied. 

Remark 1. If for a certain $m \ge 1$ the density $p_m(x)$ satisfies a Lipschitz
condition of order $\alpha$ ($0 < \alpha \le 1$), then the first condition of the theorem
is satisfied. First of all, the fact that $p_m(x)$ satisfies a Lipschitz condition
implies that the function $p_m(x)$ is bounded, and consequently that

$$\int p_m^r(x)\,dx < \infty$$

for every r > 1.

Remark 2. If the probability density p(x) is a function of bounded
variation, then the first condition of the theorem is again satisfied. To
prove this we shall show that under this assumption the function $p_2(x)$
will satisfy a Lipschitz condition. In fact, applying to the integral

$$p_2(x) = \int p(z)\,p(x - z)\,dz$$

the formula of integration by parts, we find that

$$p_2(x) = \int F(z)\,dp(x - z),$$

where

$$F(x) = \int_{-\infty}^{x} p(z)\,dz.$$

In exactly the same way,

$$p_2(x + h) = \int F(z + h)\,dp(x - z).$$

Thus

$$|p_2(x + h) - p_2(x)| \le \int |F(z + h) - F(z)|\;|dp(x - z)|. \tag{1}$$

Since the function F(x) has a bounded derivative [otherwise the variation
of p(x) would be infinite], F(x) satisfies a Lipschitz condition of order one.
The inequality (1) shows that the density $p_2(x)$ also satisfies a Lipschitz
condition of order one.* Q.E.D.


* We recall that the total variation of the function p(x) is equal to $\int |dp(x)|$.
Proof of Theorem 1.
Let

$$f(t) = \int e^{itx} p(x)\,dx,$$

then

$$f^n(t) = \int e^{itx} p_n(x)\,dx.$$

It is known that the first condition of our theorem implies the integrability
of the function $|f(t)|^n$ for all n satisfying the inequality

$$n \ge \frac{mr}{r-1}\,{}^{*}$$

(see, for example, Titchmarsh [92], Theorem 96, p. 74). Therefore for all
n satisfying this inequality, we can write the inversion formula as follows:

$$2\pi p_n(x) = \int e^{-itx} f^n(t)\,dt.$$

We put $B_n = \sigma\sqrt{n}$. It is easily seen that

$$2\pi B_n\, p_n(B_n x) = \int e^{-izx} f^n\!\left(\frac{z}{B_n}\right) dz. \tag{2}$$

Since

$$\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} = \frac{1}{2\pi} \int e^{-izx - z^2/2}\,dz,$$

to prove the theorem it is sufficient to show that as $n \to \infty$

$$R_n = \int e^{-izx} \left[f^n\!\left(\frac{z}{B_n}\right) - e^{-z^2/2}\right] dz \to 0$$

uniformly with respect to x ($-\infty < x < \infty$). To this end we represent $R_n$
as the sum of four integrals:

$$I_1 = \int_{-A}^{A} e^{-izx} \left[f^n\!\left(\frac{z}{B_n}\right) - e^{-z^2/2}\right] dz, \qquad I_2 = -\int_{|z| > A} e^{-izx - z^2/2}\,dz,$$

$$I_3 = \int_{A < |z| < \varepsilon B_n} e^{-izx} f^n\!\left(\frac{z}{B_n}\right) dz, \qquad I_4 = \int_{|z| > \varepsilon B_n} e^{-izx} f^n\!\left(\frac{z}{B_n}\right) dz,$$

where the numbers A > 0 and $\varepsilon > 0$ will be chosen later.
Since according to the second condition of the theorem F(x) belongs to
the domain of attraction of the normal law, whatever the constant A may
be

$$I_1 \to 0$$

as $n \to \infty$, uniformly with respect to x ($-\infty < x < \infty$).


* Translator's note. In the original, m is missing from the formula. 
By choosing A sufficiently large, $|I_2|$ can be made as small as we wish.
By the second condition of the theorem the function f(t) has continuous
first and second derivatives, and

$$\left[\frac{d}{dt} f(t)\right]_{t=0} = 0, \qquad \left[\frac{d^2}{dt^2} f(t)\right]_{t=0} = -\sigma^2.$$

Hence, in a neighborhood of the point t = 0,

$$f(t) = 1 - \frac{\sigma^2 t^2}{2} + o(t^2).$$

If $\varepsilon > 0$ is sufficiently small, then for $|t| \le \varepsilon$ the remainder term can be
made less than $\sigma^2 t^2/4$ in modulus. Thus for $|t| \le \varepsilon$,

$$|f(t)| < 1 - \frac{\sigma^2 t^2}{4} < e^{-\sigma^2 t^2/4}.$$

From this it follows that

$$|I_3| \le \int_{A < |z| < \varepsilon B_n} \left|f\!\left(\frac{z}{B_n}\right)\right|^n dz < \int_{A < |z| < \varepsilon B_n} e^{-z^2/4}\,dz < 2\int_A^{\infty} e^{-z^2/4}\,dz.$$

Thus if A is sufficiently large, $|I_3|$ can also be made as small as we wish.
Since $|f(t)| < 1$ for $t \ne 0$ and $f(t) \to 0$ as $|t| \to \infty$ (as characteristic function
of an absolutely continuous distribution function), it is possible to find
a c > 0 such that $|f(t)| < e^{-c}$ for $|t| > \varepsilon$. Let $\beta \ge mr/(r - 1)$ be a constant,
then

$$|I_4| \le e^{-c(n-\beta)}\, B_n \int_{-\infty}^{\infty} |f(t)|^{\beta}\,dt.$$

Since the integral on the right side of the inequality converges, as $n \to \infty$

$$I_4 \to 0.$$

Q.E.D.
If we make use of the lemma which will be proved in § 50, then it is 
possible to generalize the theorem above as follows: 


THEOREM 2.* Let the random variables

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be mutually independent, identically distributed, and have the probability
density p(x). If

1) for a certain $m \ge 1$ the probability density $p_m(x)$ of the sum

$$\xi_1 + \xi_2 + \cdots + \xi_m$$

is integrable in the rth power ($1 < r \le 2$),


* Translator’s note. See the note to Theorem 1. 
2) the function $F(x) = \int_{-\infty}^{x} p(z)\,dz$ belongs to the domain of attraction of the
stable law W(x), the characteristic function of which is defined by the
formula (1) of § 34 ($\alpha < 2$), then the relation

$$B_n\, p_n(B_n x + A_n) - p(x; \alpha, \beta, \gamma, c) \to 0 \quad (n \to \infty)$$

holds uniformly with respect to x in the interval ($-\infty < x < \infty$), where
$p(x; \alpha, \beta, \gamma, c) = W'(x)$ and the constants $A_n$ and $B_n$ are such that

$$P\left\{\frac{\xi_1 + \xi_2 + \cdots + \xi_n - A_n}{B_n} < x\right\} \to W(x) \quad (n \to \infty).$$

§ 47. IMPROVEMENT OF THE LIMIT THEOREM FOR DENSITIES


We now make somewhat stronger assumptions than in the preceding
section, but on the other hand we obtain an expansion of the density $p_n(x)$
similar to that which was found for the distribution functions in § 45.


THEOREM.* Let the random variables of the sequence

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be independent, identically distributed, and have the probability density
p(x). If

1) for a certain $m \ge 1$ the probability density of the sum

$$\xi_1 + \xi_2 + \cdots + \xi_m$$

is integrable in the rth power ($1 < r \le 2$),

2) for some $k \ge 3$ (k an integer)

$$\int |x|^k p(x)\,dx < \infty,$$

then

$$B_n\, p_n(xB_n) = \varphi(x) + \sum_{s=1}^{k-2} \frac{P_s(-\varphi)}{n^{s/2}} + o\!\left(\frac{1}{n^{(k-2)/2}}\right)$$

uniformly with respect to x ($-\infty < x < \infty$), where

$$P_s(-\varphi) = \frac{d}{dx} P_s(-\Phi)$$

and

$$B_n^2 = n \int x^2 p(x)\,dx.$$
Proof. We know that [see (2) of § 46]

$$2\pi B_n\, p_n(xB_n) = \int e^{-izx} f^n\!\left(\frac{z}{B_n}\right) dz.$$


* Translator’s note. See the note to Theorem 1, § 46. 
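Since [see (2) of § 45]

$$P_s(it)\, e^{-t^2/2} = \int e^{itx}\,dP_s(-\Phi),$$

by the inversion formula

$$2\pi P_s(-\varphi) = \int e^{-itx}\, e^{-t^2/2}\, P_s(it)\,dt$$

and consequently

$$2\pi \left[\varphi(x) + \sum_{s=1}^{k-2} \frac{P_s(-\varphi)}{n^{s/2}}\right] = \int e^{-itx} g_n(t)\,dt,$$

where for the sake of brevity we have put

$$g_n(t) = e^{-t^2/2} \left[1 + \sum_{s=1}^{k-2} \frac{P_s(it)}{n^{s/2}}\right].$$

Thus

$$R_n = 2\pi \left[B_n\, p_n(B_n x) - \varphi(x) - \sum_{s=1}^{k-2} \frac{P_s(-\varphi)}{n^{s/2}}\right] = \int e^{-izx} \left[f^n\!\left(\frac{z}{B_n}\right) - g_n(z)\right] dz.$$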
Let Т = Bis and represent R, as the sum of the following three 
integrals: 
Tak 
L= | e iza [^ (6) — 8 C) |a, 
л 
—T hk 
l} = | e iz?g (2) dz, l= | e^ iesfn (2) аг. 
1z! > Tak та> Ток g 
By Theorem 1(b) of § 41, 
Tak 12 k—2 
UR SES [nee pnmo е tam o(n э). 
n 2 — Ток 
For lz| > Tre, 





where c > 0. Hence * 


idee f op) 


Iz T, к 





* The number £ is defined as in $ 46 in the estimation of the integral Z4. 
Finally, it is obvious that


a< f рў 


8 
а> Tak $—1 „2 





P, (it) | | dx = (u^ 3) 





n 


The estimates obtained prove the theorem. 
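
The gain from the correction terms can be seen in a concrete case. The Python
sketch below is only an illustration under stated assumptions: the summands
are taken uniform on (0, 1), for which the density of the sum is known in
closed form; the odd-order correction vanishes by symmetry; and the term of
order 1/n is written in the standard Edgeworth form
$\varphi(x)\,\gamma_2\,(x^4 - 6x^2 + 3)/(24n)$ with excess kurtosis
$\gamma_2 = -1.2$, which is assumed here to coincide with $P_2(-\varphi)/n$ of
the theorem. The function names are hypothetical.

```python
import math


def irwin_hall_pdf(n, x):
    """Exact density at x of the sum of n independent uniform(0, 1) variables."""
    if x <= 0.0 or x >= n:
        return 0.0
    total = 0.0
    for k in range(int(math.floor(x)) + 1):
        total += (-1) ** k * math.comb(n, k) * (x - k) ** (n - 1)
    return total / math.factorial(n - 1)


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def density_errors(n, grid_step=0.01, x_max=4.0):
    """Maximal error on a grid without and with the 1/n correction term."""
    b_n = math.sqrt(n / 12.0)   # B_n = sigma * sqrt(n) with sigma^2 = 1/12
    gamma2 = -1.2               # excess kurtosis of the uniform law
    plain = 0.0
    corrected = 0.0
    steps = int(round(2 * x_max / grid_step))
    for i in range(steps + 1):
        x = -x_max + i * grid_step
        exact = b_n * irwin_hall_pdf(n, n / 2.0 + b_n * x)
        first = phi(x)
        second = first * (1.0 + gamma2 / (24.0 * n) * (x ** 4 - 6 * x * x + 3))
        plain = max(plain, abs(exact - first))
        corrected = max(corrected, abs(exact - second))
    return plain, corrected


if __name__ == "__main__":
    for n in (3, 12):
        print(n, density_errors(n))
```

For n = 12 the uncorrected error is of order 1/n, while the corrected error is
markedly smaller, in agreement with a remainder of smaller order than 1/n.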

The particular case of the theorem just proved, in which it is assumed
that the density p(x) is of bounded variation, was proved by H. Cramér
[19].


CHAPTER 9 
LOCAL LIMIT THEOREMS FOR LATTICE DISTRIBUTIONS 


§ 48. STATEMENT OF THE PROBLEM


If each summand $\xi_m$ can take only values of the form

$$x = sh + a,$$

then the sum

$$\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n$$

takes only values of the form

$$z_{ns} = sh + na,$$

i.e., the distribution function $F_n(z)$ of the sum $\zeta_n$ is constant in each
half-open interval

$$z_{ns} < z \le z_{n,s+1}.$$

It is impossible to approximate such a distribution function with
continuous (not to say analytic) functions to within one-half of its
maximum jump:

$$P_n(s) = F_n(z_{ns} + 0) - F_n(z_{ns}).$$

However, the real interest lies, of course, only in the study of the values
of the function $F_n(z)$ at the points $z_{ns}$ themselves, i.e., the sums

$$F_n(z_{ns}) = F_n(z_{n,s-1} + 0) = \sum_{r<s} P_n(r),$$

where

$$P_n(r) = P(\zeta_n = z_{nr}).$$

No less interest lies in the study of the probabilities $P_n(s)$ themselves,
the probabilities with which the sums $\zeta_n$ take different possible values $z_{ns}$.
Moreover, it seems most natural to investigate the asymptotic behavior
of the probabilities $P_n(s)$ as the primary problem. Thus, for example,
in elementary textbooks of the theory of probability, the local theorem
of Laplace is proved first and the integral theorem of Laplace is deduced
from it as a consequence. It is possible to deduce in a similar way the
integral theorem of § 43 from the local Theorem 2 of § 51, to be proved
later. Generally speaking, such a method of proving integral theorems on
the basis of local theorems may result in some loss in the precision of the
estimates of the remainder terms as compared with direct methods of
proof. However, for the first orientation in the problem the approach
from the direction of local theorems seems preferable. This whole chapter
will be devoted to the asymptotic estimation of the probabilities $P_n(s)$ in
the case of identically distributed independent lattice summands $\xi_m$.

In § 14 we have proved that every lattice distribution has a completely
determined maximum span $h_0$. From an arbitrary span h of the given
distribution, the maximum span $h_0$ is obtained as follows. Form all possible
differences

$$s' - s''$$

of the indices s' and s'' for which (with the span h) the probabilities

$$P_1(s') = P(\xi_m = s'h + a) \quad\text{and}\quad P_1(s'') = P(\xi_m = s''h + a)$$

are positive; find the greatest common divisor $\omega$ of all these differences and
put

$$h_0 = \omega h.$$

In accordance with this, in order that the span h itself be maximum, the
following condition is necessary and sufficient.

($\omega$). The greatest common divisor of all the differences s' − s'' for which
both probability $P_1(s')$ and probability $P_1(s'')$ are positive, is equal to
one.

Naturally, it is sufficient to consider for each lattice distribution its
maximum span, i.e., it is possible to confine our consideration to the case
where the condition ($\omega$) is satisfied.
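
The passage from an arbitrary span to the maximum span is a short
computation. The following Python sketch is offered only as an illustration;
the function name and its arguments are hypothetical. It takes the indices s
carrying positive probability and a span h, and returns the greatest common
divisor of the index differences together with the maximum span.

```python
import math
from functools import reduce


def maximum_span(support_indices, h=1.0):
    """Return (omega, h0): the gcd of index differences and the maximum span.

    `support_indices` are the integers s for which P(xi = s*h + a) > 0.
    The gcd of all pairwise differences equals the gcd of the differences
    from any fixed index, so only those are formed.
    """
    indices = sorted(set(support_indices))
    if len(indices) < 2:
        raise ValueError("a lattice distribution needs at least two atoms")
    omega = reduce(math.gcd, (s - indices[0] for s in indices[1:]))
    return omega, omega * h


if __name__ == "__main__":
    print(maximum_span([0, 2, 4]))        # the span h = 1 is not maximal here
    print(maximum_span([0, 1, 3], 0.5))   # the greatest common divisor is one
```

The condition stated above holds exactly when the first returned value is
one.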

On the other hand, for any given span h the transformation

$$\xi'_m = \frac{\xi_m - a}{h}, \qquad \zeta'_n = \frac{\zeta_n - na}{h}$$

allows us to reduce the general case to the case in which the span is equal
to one, while the constant a is equal to zero. Combining the last two remarks,
we see that in essence it is sufficient to consider the case of summands $\xi_m$
taking only integral values s with

$$P_1(s) = P(\xi_m = s),$$

subject to the condition ($\omega$).

§ 49. A LOCAL THEOREM FOR THE NORMAL LIMIT DISTRIBUTION


In accordance with § 48, we assume that the random variables $\xi_m$ take
only integral values. The sum

$$\zeta_n = \xi_1 + \xi_2 + \cdots + \xi_n$$

can also take only integral values. We put $P\{\zeta_n = k\} = P_n(k)$. It is clear
that for every n

$$\sum_k P_n(k) = 1.$$

We assume also that

$$M\xi_m = a, \qquad D\xi_m = \sigma^2$$

and introduce the notation

$$z_{nk} = \frac{k - A_n}{B_n} = \frac{k - an}{\sigma\sqrt{n}}. \tag{1}$$

The object of this section is to prove the following proposition:


THEOREM. Let the independent identically distributed random variables
$\xi_1, \xi_2, \ldots, \xi_n, \ldots$ take only integral values and have finite mathematical
expectation a and variance $\sigma^2 > 0$. In order that the relation

$$B_n P_n(k) - \frac{1}{\sqrt{2\pi}}\, e^{-z_{nk}^2/2} \to 0 \quad (n \to \infty)$$

hold uniformly with respect to k in the interval $-\infty < k < \infty$, it is
necessary and sufficient that the greatest common divisor of the differences of all
the values of $\xi_m$ taken with positive probabilities be equal to one [Condition
($\omega$) of § 48].





The particular case of this theorem for variables $\xi_m$ taking only two values
0 and 1 with probabilities respectively equal to $p \ne 0$ and $q = 1 - p \ne 0$,
forms the content of the classical local theorem of de Moivre-Laplace.


Proof. The characteristic function of the sum $\zeta_n$ is

$$f^n(t) = M e^{it\zeta_n} = \sum_k P_n(k)\, e^{itk}.$$

Consequently, $P_n(k)$ can be calculated by the formula for Fourier
coefficients:

$$2\pi P_n(k) = \int_{-\pi}^{\pi} f^n(t)\, e^{-itk}\,dt.$$

By (1)

$$k = z_{nk} B_n + A_n = zB_n + A_n$$

(in what follows we shall write z instead of $z_{nk}$, omitting the indices); hence

$$2\pi P_n(k) = \int_{-\pi}^{\pi} e^{-itB_n z - itA_n} f^n(t)\,dt = \int_{-\pi}^{\pi} e^{-itB_n z} f^{*n}(t)\,dt,$$

where we have put

$$f^*(t) = e^{-ita} f(t).$$

Finally, making the substitution $x = tB_n$, we find that

$$2\pi B_n P_n(k) = \int_{-\pi B_n}^{\pi B_n} e^{-izx} f^{*n}\!\left(\frac{x}{B_n}\right) dx.$$

Moreover, we know that

$$\frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-izx - x^2/2}\,dx.$$

Our problem consists in proving that as $n \to \infty$, the difference

$$R_n = 2\pi \left[B_n P_n(k) - \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\right]$$

tends to 0 uniformly with respect to k in the interval ($-\infty < k < \infty$).
To this end we represent $R_n$ as the sum of four integrals:

$$R_n = I_1 + I_2 + I_3 + I_4,$$

where

$$I_1 = \int_{-A}^{A} e^{-izx} \left[f^{*n}\!\left(\frac{x}{B_n}\right) - e^{-x^2/2}\right] dx,$$

$$I_2 = \int_{A \le |x| < \varepsilon B_n} e^{-izx} f^{*n}\!\left(\frac{x}{B_n}\right) dx, \qquad I_3 = \int_{\varepsilon B_n \le |x| \le \pi B_n} e^{-izx} f^{*n}\!\left(\frac{x}{B_n}\right) dx,$$

$$I_4 = -\int_{|x| > A} e^{-izx - x^2/2}\,dx.$$

Here A and $\varepsilon$ are positive numbers which will be chosen later.
Since

$$|I_4| \le \int_{|x| > A} e^{-x^2/2}\,dx \le \frac{2}{A} \int_A^{\infty} x\, e^{-x^2/2}\,dx = \frac{2}{A}\, e^{-A^2/2},$$

by choosing A sufficiently large we can make the integral $I_4$ as small as
we wish.

According to Corollary 2 to Theorem 5 of § 14 and the assumption that
the maximum span is equal to one, for every given $\varepsilon > 0$ it is possible to
find a c > 0 such that for $\varepsilon B_n \le |x| \le \pi B_n$

$$\left|f^*\!\left(\frac{x}{B_n}\right)\right| < e^{-c}.$$

Therefore,

$$|I_3| \le \int_{\varepsilon B_n \le |x| \le \pi B_n} \left|f^{*n}\!\left(\frac{x}{B_n}\right)\right| dx \le 2\pi B_n\, e^{-nc} = 2\pi\sigma\sqrt{n}\, e^{-nc},$$

and consequently for fixed $\varepsilon > 0$ the integral $I_3$ tends to zero as $n \to \infty$,
uniformly with respect to z ($-\infty < z < \infty$).

Since the function F(x), having a finite second moment, belongs to the
domain of normal attraction of the normal law,

$$f^{*n}\!\left(\frac{x}{B_n}\right) \to e^{-x^2/2}$$

and consequently for fixed A

$$I_1 \to 0$$

as $n \to \infty$, uniformly with respect to z.
To estimate the integral $I_2$ we remark that since F(x) has a finite second
moment we can write the following expansion for the function $f^*(t)$ in a
sufficiently small neighborhood of the point t = 0:

$$\log f^*(t) = t \left[\frac{d}{dt} \log f^*(t)\right]_{t=0} + \frac{t^2}{2} \left[\frac{d^2}{dt^2} \log f^*(t)\right]_{t=0} + o(t^2) = -\frac{\sigma^2 t^2}{2} + o(t^2).$$

Thus, in the neighborhood of the point t = 0,

$$f^*(t) = e^{-\sigma^2 t^2/2 + o(t^2)}.$$

From this it follows that for sufficiently small $\varepsilon > 0$ the inequality

$$|f^*(t)| \le e^{-\sigma^2 t^2/4}$$

holds in the interval $|t| \le \varepsilon$. Now

$$|I_2| \le \int_{A \le |x| < \varepsilon B_n} \left|f^{*n}\!\left(\frac{x}{B_n}\right)\right| dx \le 2\int_A^{\infty} e^{-x^2/4}\,dx \le \frac{4}{A}\, e^{-A^2/4}.$$

We see that by choosing $\varepsilon$ sufficiently small, and A sufficiently large,
it is possible to make $I_2$ and $I_4$ as small as we wish (their estimates do not
depend on n). The integrals $I_1$ and $I_3$ tend to zero as $n \to \infty$, whatever A
may be. Hence it follows that the whole sum $I_1 + I_2 + I_3 + I_4$ becomes as
small as we wish for sufficiently large n, which proves the sufficiency of
the conditions of our theorem.

The necessity of the conditions of the theorem is obvious from the fact
that if the greatest common divisor $\omega$ of the differences of the possible
values of $\xi_m$ is different from one, then the possible values of $\zeta_n$ (n = 1, 2, ...)
will contain systematic gaps: the difference between two consecutive
possible values of the sum $\zeta_n$ cannot be less than $\omega$.
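
Both halves of the theorem can be observed numerically. The Python sketch
below is an illustration only; the helper names and the two test laws are
hypothetical choices. It computes $P_n(k)$ by repeated convolution and returns
the largest value of $|B_n P_n(k) - \varphi(z_{nk})|$ over the range of the
sum. For the law uniform on {0, 1, 3} the greatest common divisor of the
differences is one and the error is small; for the law uniform on {0, 2} the
divisor is two and the error stays near $1/\sqrt{2\pi}$.

```python
import math


def pmf_of_sum(step, n):
    """Distribution of the sum of n independent copies of an integer-valued law."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for s, ps in dist.items():
            for v, pv in step.items():
                new[s + v] = new.get(s + v, 0.0) + ps * pv
        dist = new
    return dist


def local_error(step, n):
    """max over k of |B_n P_n(k) - phi(z_nk)| with B_n = sigma * sqrt(n)."""
    mean = sum(v * p for v, p in step.items())
    var = sum(p * (v - mean) ** 2 for v, p in step.items())
    b_n = math.sqrt(n * var)
    dist = pmf_of_sum(step, n)
    worst = 0.0
    # Outside this range P_n(k) = 0 and phi(z_nk) is smaller than at its ends.
    for k in range(n * min(step), n * max(step) + 1):
        z = (k - n * mean) / b_n
        normal = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        worst = max(worst, abs(b_n * dist.get(k, 0.0) - normal))
    return worst


if __name__ == "__main__":
    third = 1.0 / 3.0
    print(local_error({0: third, 1: third, 3: third}, 200))  # divisor one
    print(local_error({0: 0.5, 2: 0.5}, 200))                # divisor two
```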


§ 50. A LOCAL LIMIT THEOREM FOR NON-NORMAL STABLE
LIMIT DISTRIBUTIONS


We assume that the distribution function $F_1(x)$ of the identically
distributed independent summands $\xi_m$, taking only integral values, belongs
to the domain of attraction of the stable law G(y). Moreover, let $A_n$ and $B_n$
be constants for which

$$F_n(B_n y + A_n) = P\left\{\frac{\zeta_n - A_n}{B_n} < y\right\} \to G(y) \quad (n \to \infty).$$

It is natural to raise the question of the applicability of the corresponding
local limit formula

$$B_n P_n(k) = B_n P(\zeta_n = k) \sim g\!\left(\frac{k - A_n}{B_n}\right),$$

where

$$g(y) = G'(y)$$

is the density corresponding to the distribution function G(y).
The theorem of § 49 gives the answer to this question in the case where 


? 


1 
(у) = eO) е 2, 
Аһ=0, В, = ү һр. 


Naturally the answer to the question in the case of an arbitrary normal law 











1 00-а)? 
ү е 202 
& (у) Ves 
with constants 
| nD’ Em 


À, =n (Mi — a), B, = g 
can be reduced to the same theorem. The case of "non-normal" attraction
to the normal law (see § 35) remains an unfinished study (B. V. Gnedenko
[43]). The complete solution of the problem in the case of a non-normal
law G(y) is given by the following theorem (B. V. Gnedenko [45]):


THEOREM. Let the independent identically distributed summands $\xi_m$ take
only integral values, let g(y) be the density of a certain non-normal stable
distribution, and let $A_n$ and $B_n$ be certain constants. In order that the
relation

$$B_n P_n(k) - g\!\left(\frac{k - A_n}{B_n}\right) \to 0$$

hold uniformly with respect to k, it is necessary and sufficient that the
following two conditions be satisfied simultaneously:

1) $F_n(B_n y + A_n) \to G(y)$.

2) Condition ($\omega$) of § 48.


As examples we shall cite two particular cases of this theorem for 
specified distributions and specially chosen normalizing constants Bn. 


1. (Local theorem for the Cauchy law.) For a certain constant $\sigma > 0$ the
relation

$$\sigma n\, P_n(k) - \frac{1}{\pi\left[1 + \left(\dfrac{k}{\sigma n}\right)^2\right]} \to 0 \quad (n \to \infty)$$

is satisfied uniformly with respect to k ($-\infty < k < \infty$) if and only if

1) $\lim_{x\to-\infty} \bigl(-xF(x)\bigr) = \lim_{x\to\infty} x\bigl(1 - F(x)\bigr) = \dfrac{\sigma}{\pi}$,*

2) condition ($\omega$) is satisfied.
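2. For a certain constant $\sigma > 0$ the relation

$$\sigma n^2\, P_n(k) - g\!\left(\frac{k}{\sigma n^2}\right) \to 0 \quad (n \to \infty),$$

where

$$g(x) = \begin{cases} 0 & \text{for } x \le 0,\\[4pt] \dfrac{1}{\sqrt{2\pi}}\, x^{-3/2}\, e^{-1/(2x)} & \text{for } x > 0, \end{cases}$$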

is satisfied uniformly with respect to k (— оо < k < œ) if and only if 
1) lim V|x|F(x) = lim Vx(1—F (x))=Y a, * 
@-»—со г-у оо 


2) condition (w) is satisfied. 
The proof of the theorem formulated above proceeds in the main on the
arguments by which the Theorem of § 49 was proved. First of all, putting

$$f^*(t) = f(t)\, e^{-itA_n/n},$$

we obtain the equation †

$$2\pi B_n P_n(k) = \int_{-\pi B_n}^{\pi B_n} e^{-izt} f^{*n}\!\left(\frac{t}{B_n}\right) dt.$$

Furthermore, by the inversion formula,

$$g(z) = \frac{1}{2\pi} \int e^{-izt}\, v(t)\,dt,$$

where log v(t) is defined by formula (1) of § 34.
To estimate the difference

$$R_n = 2\pi\,[B_n P_n(k) - g(z)],$$

we represent $R_n$ as the sum of four integrals

$$R_n = I_1 + I_2 + I_3 + I_4,$$


* With respect to the condition 1) see Theorem 2 of § 35.
† As in § 49,

where

$$I_1 = \int_{-A}^{A} e^{-izt} \left[f^{*n}\!\left(\frac{t}{B_n}\right) - v(t)\right] dt,$$

$$I_2 = \int_{A < |t| < \varepsilon B_n} e^{-izt} f^{*n}\!\left(\frac{t}{B_n}\right) dt, \qquad I_3 = \int_{\varepsilon B_n < |t| < \pi B_n} e^{-izt} f^{*n}\!\left(\frac{t}{B_n}\right) dt,$$

$$I_4 = -\int_{|t| \ge A} e^{-izt}\, v(t)\,dt.$$


As proved in the preceding theorem, the integrals $I_1$, $I_3$, and $I_4$ are as
small as we wish for sufficiently large n and A, no matter how small the
previously chosen $\varepsilon > 0$ may be. (For the estimation of $I_4$ we must take
into account the fact that by (1) of § 34

$$\int |v(t)|\,dt < \infty.)$$

We shall now prove that by choosing $\varepsilon$ sufficiently small and A sufficiently
large, it is possible also to make the integral $I_2$ as small as we wish.
For this purpose we shall make use of the following lemma:


Lemma. If the distribution function F(x) belongs to the domain of attraction 
of the stable law with the characteristic exponent a (0 < а < 2), then a 
constant с> 0 can be found such that in a sufficiently small neighborhood 
of the point t = 0 the inequality 


M (1 
ree G0, (1) 
holds,* where 
х0) =1— F(x) + FC— x) 
and 


F (x) =F (x) * 1 —F (— x + 0)]. 


_ Proof. By the assumption about the function F(x) it is clear that 
F(x) belongs to the domain of attraction of the stable law G(r) = G(x) 
Ж [1 — G(-2)], for which the characteristic function is 


v(t) == etant 


* As an exercise, the reader may verify that the inequality 


1 
rece To 


also holds, where у(х) = 1 — F(x) + Е (х). 
Moreover, the normalizing constants $B_n$ for F(x) and $\bar F(x)$ may be chosen
to be the same. Therefore, according to Theorem 2 of § 35, for every
u > 0 and $x \to \infty$


x(ux) 1 


XQ) 
Now 


FA = [соз tx dF (x), 
Hence 
1 Ў = f (1 — cos tx) dF (x). 
For every t and x the inequality 1 — cos tr > 0 holds; hence 


17 5 f (1 — cos tx) dF (x). 


т 3r 


zijn ls 1] 





But in the domain of integration, 
1 < 1 —соѕіх < 2; 


consequently, 


1—70> f — Fe x(G) 














and for all t (|! € ©) 
тах (| о, |, | ®„|)< yEy- (2у] = Cy: 


From this it follows that for [| € e 


1—f( > 2x (15) 


* Translator’s note. In the original an equality sign is written, which is incorrect 
if F(x) is discontinuous. 
and so
~ yt 
; RE E 
1 — 2c (rj) e EAS 


Since 


FO=OP, 


the inequality obtained proves the lemma. 

We can now turn to the estimation of the integral $I_2$. To this end we
choose $\varepsilon > 0$, so that the inequality (1) is satisfied in the interval $|t| \le \varepsilon$.
Then for $|t| \le \varepsilon B_n$, we have

n t 
=|") 


ОР 03. 


But for sufficiently large n [see $ 35, (9) and (10)], 











ny (o2) ~ a |4. (с, > 0).* 
Consequently, in all cases, for sufficiently large n, 
ny, (Ee) pe |t p. 
Therefore for |4 < «B, and sufficiently large n, 


| (x) « 
if B, xe 


CoC, 
2^ 


Шы 


Now 


cB, со 
— Slr eye > Cpa 
< f е? at< | e ? dt. 
A A 


The last integral can be made as small as we wish by the choice of A. This
completes the proof of the theorem.


§ 51. IMPROVEMENT OF THE LIMIT THEOREM IN THE CASE OF
CONVERGENCE TO THE NORMAL DISTRIBUTION


We now turn to the improvement of the results of § 49. To do so we
shall have to impose stronger conditions on the summands. Suppose the
random variables $\xi_m$ can take only the values

$$x_s = a + sh \quad (s = 0, \pm 1, \pm 2, \ldots),$$
* Translator’s note. In the original the constant c, is missing. According to § 34, 
and using the notations there, c, is related to the c occurring in v(t) as follows: 


c = aLla) cos Z aif a < ;с= аа = l; e = -cM (a) cos S aif a > 1. 
where h is the maximum span of the distribution.* The random variable

$$\eta_n = \frac{1}{B_n} \sum_{m=1}^{n} (\xi_m - M\xi_m)$$

can take only the values

$$y = y_{ns} = \frac{h\,(s - np)}{\sigma\sqrt{n}},$$

where

$$p = \sum_{s=-\infty}^{\infty} s\,p_s \quad\text{and}\quad p_s = P\{\xi_m = a + sh\}.$$

From the fact that

$$f_n(t) = \sum_s P_n(s)\, e^{ity_{ns}},$$

where $P_n(s) = P\{\eta_n = y_{ns}\}$, we find that, as in the preceding section,


‚леп 


9, (s) = ae J Fn (e "nat. 
1 


тч" 


As before, т = 2r/h. We shall now prove the following proposition (Esseen 


[26]): 


THEOREM 1. If the identically distributed lattice random variables $\xi_1$,
$\xi_2, \ldots, \xi_n$ are independent and have finite absolute moments of the order
k ($k \ge 3$) inclusive, then





8, (s) = T 


Үп v=1 2 


n 





1 
Here ¢(yns) = —— 
Uns) мт 
in $ 41 by substituting е for Ф.{ 
Proof. We have 


хз yn 8, (5) == 1, + 15, 


оо У Р,(=— On) + 0 


1 
E 
n ? 


? 
Yns 
72 and P,(—¢) is defined as the function P,(—®) 


(1) 


* As already remarked in § 48, it is possible to confine ourselves to the case 
h = 1, as was done in $$ 49-50. We do not do this here for the sake of tnore con- 


venient comparison with the results of $$ 43-44. 


t Translator's note. P.( —4) is obtained [rom the polynomial P,(—1) defined in 
$38 by substituting qe? for w. Cf. (28) of $38 and also the statement of the 


Theorem in § 47. 
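where

$$I_1 = \frac{1}{2\pi}\int_{-\tau\sigma\sqrt{n}/2}^{\tau\sigma\sqrt{n}/2} e^{-ity}\left[f_n(t) - e^{-t^2/2}\left(1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right)\right]dt$$

and

$$I_2 = -\frac{1}{2\pi}\int_{|t| > \tau\sigma\sqrt{n}/2} e^{-ity}\,e^{-t^2/2}\left(1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right)dt$$

(henceforth we shall write $y$ instead of $y_{ns}$).

We shall suppose that $T_{kn} < \tau\sigma\sqrt{n}/2$; otherwise the estimates will
only be simplified. We can then write the equation

$$I_1 = I_3 + I_4, \tag{2}$$

where

$$I_3 = \frac{1}{2\pi}\int_{|t| \le T_{kn}} e^{-ity}\left[f_n(t) - e^{-t^2/2}\left(1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right)\right]dt$$

and

$$I_4 = \frac{1}{2\pi}\int_{T_{kn} < |t| \le \tau\sigma\sqrt{n}/2} e^{-ity}\left[f_n(t) - e^{-t^2/2}\left(1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right)\right]dt.$$

Using Theorem 1(b) of § 41, we find that

$$I_3 = o\!\left(\frac{1}{n^{(k-2)/2}}\right). \tag{3}$$

In $T_{kn} \le |t| \le \tau\sigma\sqrt{n}/2$ there exists a constant $c > 0$ such that $|f_n(t)| < e^{-cn}$,
hence it is not difficult to see that

$$\int_{T_{kn} \le |t| \le \tau\sigma\sqrt{n}/2} |f_n(t)|\,dt = o\!\left(\frac{1}{n^{(k-2)/2}}\right). \tag{4}$$

But

$$|I_4| \le \frac{1}{2\pi}\int_{T_{kn} \le |t| \le \tau\sigma\sqrt{n}/2} |f_n(t)|\,dt + \frac{1}{2\pi}\int_{|t| \ge T_{kn}} e^{-t^2/2}\left|1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right|dt. \tag{5}$$

Since

$$\int_{|t| \ge T_{kn}} e^{-t^2/2}\left|1 + \sum_{\nu=1}^{k-2}\frac{P_\nu(it)}{n^{\nu/2}}\right|dt = o\!\left(\frac{1}{n^{(k-2)/2}}\right), \tag{6}$$

the equations (1)-(6) prove the theorem.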
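As a particular case of Theorem 1 for k = 3 we obtain:


THEOREM 2. If the identically distributed lattice random variables $\xi_1$,
$\xi_2$, ..., $\xi_n$ with maximum span $h$ and $M\xi_1 = 0$ are independent and have
finite third moments, then

$$\frac{\sigma\sqrt{n}}{h}\,\mathfrak{P}_n(s) = \varphi(y_{ns}) + \frac{P_1(-\varphi(y_{ns}))}{\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right).$$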
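As a numerical illustration of Theorem 2, the following sketch compares the exact lattice probabilities of a sum of Bernoulli variables (span h = 1) with the normal term alone and with the first correction term added. It assumes the standard Edgeworth form $P_1(-\varphi(y)) = \frac{\kappa_3}{6\sigma^3}(y^3 - 3y)\varphi(y)$, which is not restated in this section; the function name and the parameters n = 400, p = 0.3 are chosen only for illustration.

```python
import math

def local_limit_errors(n, p):
    """Max over s of |sigma*sqrt(n)*P{S_n = s} - approximation| for a Bernoulli(p) sum.

    Returns (error of phi(y) alone, error with the first correction term).
    The correction uses the standard Edgeworth form (an assumption here):
        P_1(-phi(y)) = kappa3 / (6 sigma^3) * (y^3 - 3y) * phi(y).
    """
    q = 1.0 - p
    sigma = math.sqrt(p * q)
    kappa3 = p * q * (q - p)          # third central moment of Bernoulli(p)
    scale = sigma * math.sqrt(n)      # sigma * sqrt(n) / h with h = 1
    err_plain = err_corrected = 0.0
    for s in range(n + 1):
        exact = scale * math.comb(n, s) * p**s * q**(n - s)
        y = (s - n * p) / scale       # the lattice point y_ns
        phi = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        corr = kappa3 / (6.0 * sigma**3) * (y**3 - 3.0 * y) * phi / math.sqrt(n)
        err_plain = max(err_plain, abs(exact - phi))
        err_corrected = max(err_corrected, abs(exact - phi - corr))
    return err_plain, err_corrected

plain, corrected = local_limit_errors(400, 0.3)
print(plain, corrected)
```

Centering the variables changes neither the span nor the points $y_{ns}$, so the condition $M\xi_1 = 0$ of the theorem is no restriction here. The corrected error should come out noticeably smaller than the uncorrected one.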


From this local theorem it is possible to obtain the integral theorem
of § 43. As in the deduction of the integral theorem of de Moivre-Laplace
from the local theorem, here in the infinite sum

$$F_n(y_{ns}) = \sum_{r<s} \mathfrak{P}_n(r)$$

the estimate of Theorem 2 is directly applied only to terms $\mathfrak{P}_n(r)$ with
indices $r$ which are not far from $np$, while the terms with large differences
$|r - np|$ are estimated specially.

From Theorem 1 for k > 3 it is possible to obtain corresponding estimates
for $F_n(x)$ with a remainder term of the order

$$o\!\left(\frac{1}{n^{(k-2)/2}}\right).$$

We shall not give these estimates here (see Esseen [26]).
NOTES ON CHAPTER 1
by J. L. Doob


It is possible, although undesirable and unnatural, to write a book on 
the limit distributions of sums of independent random variables with 
essentially the content of this book, but without the use of the name random 
variable anywhere in the text. This can be done as follows. Let $x_1, \ldots, x_n$
be mutually independent random variables, with respective distribution
functions $F_1, \ldots, F_n$, and respective characteristic functions $f_1, \ldots, f_n$.
Then the distribution function of $s_n = x_1 + \cdots + x_n$ is the convolution
of the $F_j$'s, and the characteristic function of $s_n$ is the product of the $f_j$'s.
The standard procedure is to investigate the distribution of $s_n$ by means
of its characteristic function. One can carry through this work with no
reference to the $x_j$'s or $s_n$, by simply discussing the iterated convolutions
of distribution functions, and the corresponding products of characteristic 
functions. In this book, since much of it is concerned only with char- 
acteristic functions, it would even be possible to phrase much of the 
material entirely in terms of characteristic functions, omitting reference 
both to random variables and distribution functions. Although this type 
of treatment is not uncommon in distribution theory, it would be unde- 
sirable in a large work of the present kind, since it would have been mis- 
leading to give the great quantity of material in this book without reference 
to the basic theory of probability which provides the context that gives 
the material its importance. 
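The correspondence between convolutions of distributions and products of characteristic functions described above can be checked on a small case. The sketch below is only an illustration for integer-valued variables with finitely many values; the helper names `convolve` and `char_fn` and the two sample distributions are assumptions made for the example.

```python
import cmath

def convolve(p, q):
    """Distribution of the sum of two independent variables.

    p and q map each possible value to its probability.
    """
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * qb
    return out

def char_fn(p, t):
    """Characteristic function sum_x P{x} * exp(i t x) at the point t."""
    return sum(prob * cmath.exp(1j * t * x) for x, prob in p.items())

F1 = {0: 0.5, 1: 0.5}
F2 = {0: 0.25, 2: 0.75}
S = convolve(F1, F2)

# The characteristic function of the sum equals the product of the factors.
for t in (0.0, 0.3, 1.7, -2.5):
    assert abs(char_fn(S, t) - char_fn(F1, t) * char_fn(F2, t)) < 1e-12
```

Nothing in the check refers to random variables: it involves only the two distributions and their characteristic functions, which is the point made in the text.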

Although not much of the basic theory of probability is needed for this 
book, the authors quite properly judged a short outline to be necessary, 
since there is no reference book available in any language which covers it 
properly. It is an interesting fact that with all the research going on in the 
theory of probability, there is still no text (such as there are in great 
numbers, for example, in the theory of functions of a complex variable) 
which starts from the beginning, makes all the necessary definitions, with 
a proper discussion of each, and proves the basic theorems, thus assuring 
the student that only the analytical details peculiar to particular develop- 
ments remain to hinder him. The point is not that there is no good book 
which covers this material, but that no book has even been written with 
this purpose in mind. This appendix is written to elaborate in more detail 
some aspects of the basic theory which appear in rather compact form in 
Chapter 1. 

There is now essential unanimity among mathematicians working in 
probability that, for mathematical purposes, an event is a measurable 
point set, the probability of an event is the measure of the point set, a 
random variable is a measurable function, and the expectation of a random
variable is the integral of the function. It is not generally realized, how- 
ever, that these basic definitions do not suffice to set up the theory of 
probability, but that considerable elaboration is necessary, along the 
following lines. 


P1 The various models for families of random variables and associated 
measure spaces must be treated. 


P2 Conditional probabilities and expectations must be defined, and their 
properties established. 


P3 The basic probability measure must be discussed, in terms of the 
desirability of suitable restrictions and canonical modifications. 


These three topics have not been listed in the order in which they 
would be treated in a systematic text, but in the order in which it will be 
convenient to treat them here. In the following we shall suppose, until we 
come to the discussion of P3, that the given probability measure satisfies
only the restrictions (μ1) and (μ2) given in § 2. The convenience and
necessity of further restrictions will be discussed later. 

Remarks on P1. Random variables were defined above. Their existence, 
satisfying specified conditions, is a separate question, usually solved by 
the use of mathematical models. 

For example, consider Theorem 4, § 21, whose statement begins: Let

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be a sequence of mutually independent random variables and let the distribution
function of $\xi_k$ be $F_k(x)$. The theorem is quite correct without an answer to
the following question, but its importance is considerably enhanced by the
fact that the answer is affirmative. The question is: “If the $F_k$'s are specified,
is there a corresponding sequence of mutually independent random
variables, defined on some measure space?” It would be unfortunate
if the answer to this question depended on the particular distributions
involved. Actually, there is always such a sequence of random variables,
and the random variables can be taken as the coordinate variables of an
infinite dimensional coordinate space. In fact, Kolmogorov showed [65]
that a family of random variables $\{\xi_\alpha, \alpha \in T\}$ indexed in an arbitrary
set $T$ can be defined if, for each finite index set $\alpha_1, \ldots, \alpha_n$, the $n$-variate
distribution of $\xi_{\alpha_1}, \ldots, \xi_{\alpha_n}$ is specified, as long as these specified finite
dimensional distributions are compatible. Compatibility means that,
if $m < n$, the marginal distribution of $\xi_{\alpha_1}, \ldots, \xi_{\alpha_m}$ obtained from the above
is exactly the specified distribution of $\xi_{\alpha_1}, \ldots, \xi_{\alpha_m}$. Kolmogorov defined
the $\xi_\alpha$'s as random variables on a coordinate space with $\xi_\alpha$ the $\alpha$th coordinate
variable, defining a measure of sets in this coordinate space in such a way
that the $\xi_\alpha$'s have the desired distributions. In particular, if any family
of random variables $\{x_\alpha, \alpha \in T\}$ is given, the distributions of finite sets
of these random variables determine, according to this method, a measure
in a coordinate space whose dimensionality is the cardinal number of $T$,
and a family $\{\xi_\alpha, \alpha \in T\}$ of coordinate random variables of this space,
with the property that, for any finite index set $\alpha_1, \ldots, \alpha_n$, the two sets of
random variables

$$x_{\alpha_1}, \ldots, x_{\alpha_n} \qquad\text{and}\qquad \xi_{\alpha_1}, \ldots, \xi_{\alpha_n},$$

defined on different spaces, have the same $n$-variate distribution. Thus
distribution problems involving the original variables can be stated in
terms of the coordinate variables, and are frequently thereby simplified.
For example, if there is a finite number $n$ of random variables in the given
family, that is, if the index set $T$ contains only $n$ points, distribution
problems are reduced to problems involving ordinary distributions in
$n$-dimensional space. In the latter case, of course, the $n$-dimensional
distribution is derived from the given random variables by the simple
map described in § 2. The given $n$ random variables map the basic space of
elementary events into a subset of $n$-dimensional space; this map is used
to define a measure in $n$-dimensional space in such a way that the transformation
between the two spaces is measure preserving. In the most general
case, a family of random variables $\{x_\alpha, \alpha \in T\}$ is mapped into a family
$\{\xi_\alpha, \alpha \in T\}$. More precisely, the elementary event space on which the $x_\alpha$'s
are defined is mapped into the coordinate space on which the $\xi_\alpha$'s are
defined. The function $x_\alpha$ goes into the function $\xi_\alpha$, and the map preserves
(probability) measure. It is desirable to have the closest possible relation
between $x_\alpha$'s and $\xi_\alpha$'s, and the hypothesis to be discussed below, that the
basic probability measure is perfect, is a step in this direction.
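The compatibility condition on the finite dimensional distributions can be made concrete in the simplest discrete case. The following sketch is an illustration only: it assumes distributions with finitely many atoms, stored as dictionaries, and the names `marginal` and `compatible` are hypothetical helpers introduced for the example.

```python
def marginal(joint, keep):
    """Marginal distribution of the coordinates listed in `keep`.

    `joint` maps tuples of values to probabilities.
    """
    out = {}
    for point, prob in joint.items():
        key = tuple(point[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

def compatible(joint, specified, keep, tol=1e-12):
    """True if the marginal of `joint` on `keep` agrees with `specified`."""
    m = marginal(joint, keep)
    keys = set(m) | set(specified)
    return all(abs(m.get(k, 0.0) - specified.get(k, 0.0)) <= tol for k in keys)

# A specified 2-variate distribution and two candidate 1-variate ones.
joint = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.2, (1, 1): 0.3}
print(compatible(joint, {(0,): 0.5, (1,): 0.5}, keep=[0]))
print(compatible(joint, {(0,): 0.6, (1,): 0.4}, keep=[0]))
```

Only the first candidate is the marginal of the specified joint distribution, so only that pair of specifications could belong to one compatible family.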

Care must be taken to differentiate between the existence of a random
variable defined on the given measure space and the existence of a distribution
with specified properties. As an example, consider the definition of
an infinitely divisible random variable, given in § 17. The random variable
$\xi$ is there said to be infinitely divisible if, for every positive integer $n$, it can
be expressed as the sum of $n$ independent identically distributed random
variables. Note that this definition imposes two requirements, for each
value of $n$: (i) the characteristic function of $\xi$ is to be the $n$th power of a
characteristic function; (ii) the structure of the given measure space is
complex enough to support $n$ identically distributed independent random
variables with sum $\xi$. Actually the first (weaker) requirement is all that
is usually of interest, and all that is treated in this book. It is true, although
not proved in this book, that to every distribution whose characteristic
function has the property (i) there is a probability measure space and a
random variable defined on it whose distribution has this characteristic
function and satisfies (ii). It is not true that if a random variable satisfies
(i), it also satisfies (ii). In fact, if $\xi$ is a random variable with a Poisson
distribution, (i) is satisfied, that is, $\xi$ has an infinitely divisible distribution,
but the representation $\xi = \xi_1 + \xi_2$, where $\xi_1$, $\xi_2$ are independent and have
a common distribution, may or may not be possible, depending on the
basic space. In fact, if $\xi$ has expectation 1, $\xi_1$ and $\xi_2$ must each have a Poisson
distribution with expectation $\frac{1}{2}$, and if the basic space is a sequence of
points, each point corresponding to a single value of $\xi$, it is easy to see that
the stated representation is impossible. (The only probabilities that
exist are sums of terms of the series $e^{-1}\sum_{n=0}^{\infty} 1/n!$, and the probability $e^{-1/2}$
that $\xi_1 = 0$ cannot be expressed in this form, as an elementary examination
of possibilities shows.)
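The "elementary examination of possibilities" can be imitated numerically. Since every available probability is $e^{-1}$ times a sum of distinct terms $1/n!$, the question is whether $e^{1/2}$ is such a sum. The sketch below is a floating-point check under a stated tolerance, not a proof; the function name, the tolerance, and the truncation at 40 terms are assumptions made for the illustration.

```python
import math

def subset_sum_reachable(target, tol=1e-12, n_terms=40):
    """Can `target` be written as a sum of distinct terms 1/n!, n = 0..n_terms-1?

    Branch-and-bound search: a branch is abandoned as soon as the remaining
    terms cannot reach the remainder, or the remainder has become negative.
    """
    terms = [1.0 / math.factorial(n) for n in range(n_terms)]
    # tails[i] = sum of terms[i:], the most the remaining terms can contribute
    tails = [0.0] * (n_terms + 1)
    for i in range(n_terms - 1, -1, -1):
        tails[i] = tails[i + 1] + terms[i]

    def search(i, remaining):
        if abs(remaining) <= tol:
            return True
        if remaining < 0 or i == n_terms or remaining > tails[i] + tol:
            return False
        return search(i + 1, remaining - terms[i]) or search(i + 1, remaining)

    return search(0, target)

# P{xi_1 = 0} = e^{-1/2} would require e^{1/2} to be a sum of distinct terms 1/n!.
print(subset_sum_reachable(math.exp(0.5)))   # no such sum is found
print(subset_sum_reachable(1.5))             # 1/0! + 1/2! is found
```

The search fails quickly because $1 + \frac{1}{2} + \frac{1}{6}$ already exceeds $e^{1/2}$, while $1 + \frac{1}{2}$ together with all terms beyond $1/3!$ falls short of it.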

Remarks on P2. There is no reason to give a detailed discussion of condi- 
tional probabilities and expectations here. However, the following is an 
example of the difficulties that can arise, due to the fact that conditional 
probabilities and expectations are not uniquely defined. Suppose that 
conditional probabilities of sets have been defined, relative to some specified 
conditions which we omit. Then it is important to know when these condi- 
tional probabilities can be treated as ordinary probabilities, that is, as 
defining probability measures in terms of which integration yields condi- 
tional expectations. This is not always possible, but it is possible, and even 
trivially simple, if the condition is a specified one of positive probability. 
This special case is all that is needed for the present book, and is treated 
in Chapter 1. 

Remarks on P3. We accept without argument the hypothesis that the 
measure in question is completely additive. This is, of course, not necessary 
as a requirement either philosophically or mathematically, but there is no
present indication that there is any advantage in treating finitely additive 
measures. 

The authors incorporate completeness of a measure [condition (μ3) of § 2]
as part of their measure definition. This condition is in no way relevant to
the needs of the book, but it is frequently convenient in measure studies,
and it is harmless in the following sense. A measure which is not already
complete can be made complete by adding to the class of measurable sets
every set which is not measurable but which has the property that there
are two measurable sets of the same measure, one containing the set and
the other contained in it. The nonmeasurable set is assigned as measure the
common measure of these two measurable sets. This operation of completion
not only leaves unchanged the measures of the given measurable sets, but
adds to the class of measurable sets those and only those sets whose
measures are uniquely determined by the given measure function if they
are to be assigned measures.

The authors add a further condition to their measures; they are to be
perfect. This condition is also quite unnecessary for the purposes of this
book. The following remarks will, however, give some indication of the
advantages to be gained from this restriction, which is certainly a convenience,
although not a necessity. We shall first find a condition necessary
and sufficient that a measure be perfect. This condition is really only a
slight rephrasing of the definition, but gives insight into its significance.

Let $F$ be any distribution function of one variable. Define the outer
measure of a linear set $A$ as $\inf \sum_j [F(b_j) - F(a_j)]$, where $\bigcup_j [a_j, b_j)$ is any
union of semiclosed intervals (including left-hand end points only) which
covers $A$. This outer measure determines a measure in accordance with
the usual Carathéodory method. We shall call the measurable linear
sets obtained in this way F-measurable. This measure, sometimes called
Lebesgue-Stieltjes measure based on $F$, is complete, and in fact the F
measure considered defined only on the linear Borel sets yields F measure
when completed as described above.

Now let $\xi$ be any random variable defined on a measure space satisfying
conditions (μ1), (μ2), (μ3) of § 2, and let $F$ be the distribution function
of $\xi$. In the notation of § 2, the map $u' = \xi(u)$ maps the space $U$ of elementary
events on which the given probability measure $\mu$ is defined, onto
a subset of the line $U'$. A measure $\mu'$ of linear sets is defined by setting

$$\mu'(A) = \mu[\xi^{-1}(A)]$$

for every $A$ with the property that $\xi^{-1}(A)$ is $\mu$-measurable. This defines
a measure on a certain Borel field $M_\xi$ of $U'$ sets. If we make $A$ an interval
here, we find that $\mu'$ and $F$ measures are the same on intervals, and therefore
on all Borel sets. It is clear that $\mu'$ measure is complete. Hence, by
definition of F measure, the two measures are equal on the class of F-measurable
sets. The two measures need not be identical, however, because $M_\xi$
may contain sets which are not F-measurable.


THEOREM. The $\mu$ measure is perfect if and only if, for every $\xi$, $\mu'$ and $F$
measures are identical.


Suppose first that $\mu$ measure is perfect. This means that, if $A$ is $\mu'$-measurable,
there is, for each positive integer $n$, an open set containing
$A$ and of $\mu'$ measure at most $\mu'(A) + 1/n$. The intersection of a sequence of
open sets obtained in this way for each value of $n$ is a Borel set $A_2$ containing
$A$, with $\mu'(A) = \mu'(A_2)$. Applying this argument to the complement
of $A$, we find a Borel set $A_1$, contained in $A$, the union of a sequence of
closed sets, with $\mu'(A) = \mu'(A_1)$. Now $A_1$ and $A_2$ are Borel sets, and as
such are F-measurable, with F measures the same as their $\mu'$ measures.
Since F measure is complete, it follows that $A$ is also F-measurable,
because it lies between two F-measurable sets of the same measure. Thus,
if $\mu$ measure is perfect, $\mu'$ and $F$ measures are identical, in domains of
definition and values. Conversely, if these two measures are identical for
every choice of $\xi$, the $\mu'$ measure shares with F measure the property that
the $\mu'$ measure of a set is the lower limit of the $\mu' = F$ measures of containing
open sets, and this property is the defining property of a perfect
measure. (The fact that F measure has the stated property will be found
in any text which discusses Lebesgue-Stieltjes measures.)

Thus the essential significance of the hypothesis that the basic measure 
is perfect is that under the natural map of the space of elementary events 
into a line, discussed in § 2, determined by a random variable, or more
generally under the map into Euclidean n-space determined by n random 
variables, the only Euclidean space sets whose inverse images are measur- 
able are the Borel sets and the sets obtained from these by completing 
the measure of Borel sets in n-space determined by the multivariate 
distribution function of the given n random variables. Since this map is 
a useful tool in reducing problems involving n random variables to those
involving the coordinate variables in Euclidean n-space, it is sometimes 
convenient to have the simple relation just described. 

Two examples will now be given illustrating on the one hand the convenience
of the hypothesis that the basic probability measure is perfect,
and on the other the fact that this hypothesis is by no means a necessity. 

Let $\xi_1$, $\xi_2$ be random variables, defined on the same measure space.
Then for the purposes of probability theory it is most useful to define $\xi_1$ to
be independent of $\xi_2$ if, for every pair of linear Borel sets $A_1$, $A_2$,


(I-1)   $P\{\xi_1 \in A_1, \xi_2 \in A_2\} = P\{\xi_1 \in A_1\}\,P\{\xi_2 \in A_2\}.$


(It is easy to see that this equation holds as stated if it holds whenever $A_1$
and $A_2$ are intervals, or even intervals with $-\infty$ as left-hand end point,
and the condition is commonly stated with $A_1$ and $A_2$ of this form.) On the
other hand, it would be natural from some points of view, and somewhat
more elegant, to prescribe that (I-1) be true whenever $A_1$ and $A_2$ are sets
of real numbers such that the right side of (I-1) is defined, that is, such
that the two sets of elementary events involved on the right are measurable.
This second definition, which is the one actually given by Gnedenko and
Kolmogorov, is (almost trivially) equivalent to the first if the basic probability
measure is perfect. The two definitions are not necessarily equivalent
otherwise,* so that, without this restriction on the basic probability
measure, the definition of independence given in this book would be
modified to that given at the beginning of this paragraph.
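For random variables with finitely many values, the parenthetical remark that (I-1) need only be checked on half-lines can be seen directly. The sketch below is an illustration under that finiteness assumption; it uses closed half-lines ending at the atoms, which for such variables carries the same information as open half-lines, and all function names are hypothetical.

```python
from itertools import chain, combinations

def prob(joint, A1, A2):
    """P{xi_1 in A1, xi_2 in A2} for a joint distribution {(x, y): probability}."""
    return sum(p for (x, y), p in joint.items() if x in A1 and y in A2)

def subsets(values):
    """All subsets of a finite collection of values."""
    return chain.from_iterable(combinations(values, r) for r in range(len(values) + 1))

def factorizes(joint, sets1, sets2, xs, ys, tol=1e-12):
    """Check (I-1) for every pair (A1, A2) drawn from the two given classes of sets."""
    return all(
        abs(prob(joint, A1, A2) - prob(joint, A1, ys) * prob(joint, xs, A2)) <= tol
        for A1 in sets1 for A2 in sets2
    )

def check(joint):
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    half1 = [set(v for v in xs if v <= a) for a in xs]   # half-lines for xi_1
    half2 = [set(v for v in ys if v <= b) for b in ys]   # half-lines for xi_2
    on_half_lines = factorizes(joint, half1, half2, set(xs), set(ys))
    on_all_sets = factorizes(joint, [set(s) for s in subsets(xs)],
                             [set(s) for s in subsets(ys)], set(xs), set(ys))
    return on_half_lines, on_all_sets

independent = {(x, y): px * py for x, px in {0: 0.3, 1: 0.7}.items()
               for y, py in {0: 0.5, 2: 0.5}.items()}
dependent = {(0, 0): 0.5, (1, 1): 0.5}
print(check(independent), check(dependent))
```

In both sample cases the two checks agree. The distinction between the two definitions discussed in the text concerns sets that are not Borel sets, which cannot arise in such a finite example.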

As a second example, consider the problem of defining conditional
probabilities and expectations. The details will not be given here, but
if the basic measure is perfect, the problem mentioned under P2 has an
affirmative answer. Without this hypothesis, conditional probabilities
must be treated with somewhat greater care, but conditional probabilities
nevertheless act about like probabilities for most purposes. For example,
the usual integral (expectation) inequalities associated with the names
Bunyakovski, Cauchy, Euler, Hölder, Jensen, Minkowski, Schwarz, and
so on are valid for conditional expectations just as they are for expectations.
The point is that any problem involving a family of random variables
can be reduced, even if the basic measure is not perfect, to a corresponding
problem involving the coordinate variables of a coordinate
space (see the discussion of P1) for which the problems associated with
nonperfect measures do not arise. As is true in certain other problems,
the important fields must be discovered and analyzed.


* See B. Jessen, Coll. Math. 1, 214-215 (1948), and J. L. Doob, ibid. pp. 216-217
for examples in which the two independence definitions are not equivalent.

We conclude with a problem of a different type. In any discussion of 
continuous parameter families of random variables, as in the discussion 
of Brownian motion, it is desirable to make statements on the distributions 
of the superior limit of a nondenumerable collection of random variables, 
and on the continuity in the indexing parameter of sample functions of
the family. The probabilities needed in such an analysis are not necessarily 
defined, that is, the corresponding classes of elementary events are not 
necessarily measurable, even if the basic probability measure is complete 
and perfect. Thus some standard method must be accepted, either for 
reinterpreting probabilities, or for modifying the basic measure space 
or measure defined on it, to make it possible to define the desired prob- 
abilities.* 


* See J. L. Doob, Stochastic Processes, New York (1953). 


APPENDIX II 
NOTES ON § 32


Theorem 1 of § 32 needs amplification.

First, the so-called “distribution function” $V(x)$ there is not necessarily
continuous to the left. In fact, it is not difficult to see that if $F(x)$ is
unimodal with vertex at $x = 0$, then the left derivative $F'_-(x)$ is continuous to
the left, and the right derivative $F'_+(x)$ is continuous to the right for every $x \ne 0$.
Hence at a point $x$ where $F'_-(x) \ne F'_+(x)$, the function $F(x) - xF'(x) = V(x)$
will be continuous to the left or to the right according as $F'(x)$ is taken to
be the left or the right derivative at this point. This indetermination
of $V(x)$ at discontinuity points is inessential, but it conflicts with the
definition adopted in the book (see Chapter 1, § 6).

In the sufficiency part of Theorem 1, namely, in the statement “if
$V(x) = F(x) - xF'(x)$ is a distribution function then $F(x)$ is unimodal,”
it is not clear what preliminary assumption is made on the distribution
function $F(x)$. This is caused again by the ambiguity of $F'(x)$. Suppose
that the preliminary assumption were that either $F'_-(x)$ exists everywhere
or else $F'_+(x)$ exists everywhere; then the statement is clearly incorrect.
To see this, we need only consider any (left-continuous) distribution which
is a step function with more than one jump; then $F'_-(x) = 0$ everywhere
and $F(x) - xF'_-(x)$ reduces to $F(x)$ itself, but $F(x)$ is not unimodal. In
this example, of course, $F'_+(x)$ is infinite and $F(x)$ discontinuous at some
point $x \ne 0$. The situation may be remedied by requiring that both
$F(x) - xF'_-(x)$ and $F(x) - xF'_+(x)$ lie between $V(x - 0)$ and $V(x + 0)$
(both inclusive), for then $F'_-(x)$ and $F'_+(x)$ will be finite and $F(x)$ continuous
for every $x \ne 0$. A simpler formulation is given in Theorem 1(b) below.

After these remarks we now give a precise version of Theorem 1. The 
term ‘‘distribution function” is used in the strict sense. 


THEOREM 1. (a) If the distribution function $F(x)$ is unimodal with vertex
at $x = 0$, then there exists a distribution function $V(x)$ such that

$$F(x) - xF'_-(x) = V(x)$$

and

$$F(x + 0) - xF'_+(x) = V(x + 0)$$

for every $x$. (A product $0 \cdot \infty$ is taken to be 0.)
(b) Let the distribution function $F(x)$ be continuous except possibly at
$x = 0$. Suppose that there is a denumerable set $D$ of points $x$ and a distribution
function $V(x)$ such that if $x$ is not in $D$, the right or left derivative
$F'(x)$ (possibly different ones at different points) exists and satisfies the
equation

$$F(x) - xF'(x) = V(x);$$

then $F(x)$ is unimodal.


The proof of the theorem proceeds as in the text. In part (a) we use
the italicized proposition given in the beginning of this Appendix. In 
part (b) we use the following theorem.* 

If F is a continuous function on an interval I, and if at each point of this 
interval, except those of a denumerable set, one at least of the four Dini deriv- 
atives is equal to zero, then the function F is constant on I. 

Theorem 3 of § 32, attributed to A. I. Lapin, asserts that “The composition
of two unimodal distribution functions with vertex 0 is unimodal with
vertex 0.” The proof given of this theorem proceeds as follows. Let
$F_i(x)$, $i = 1, 2$, be unimodal with vertex 0, and let $F_i'(x)$ denote the left
derivative of $F_i(x)$. By Theorem 1, the two functions

$$V_i(x) = F_i(x) - xF_i'(x) \qquad (i = 1, 2)$$

are distribution functions. If

$$F = F_1 * F_2 = F_2 * F_1,$$

then

$$F'(x) = (F_1 * F_2)'(x) = \frac{d}{dx}\int F_1(x - z)\,dF_2(z) = (F_1' * F_2)(x) = (F_2' * F_1)(x) = \tfrac{1}{2}\bigl[(F_1' * F_2)(x) + (F_2' * F_1)(x)\bigr].$$

It follows that

$$F(x) - xF'(x) = \tfrac{1}{2}\bigl[F_1 * F_2 + F_2 * F_1\bigr] - \tfrac{1}{2}\,x\bigl[F_1' * F_2 + F_2' * F_1\bigr] = \tfrac{1}{2}\bigl[(F_1 - xF_1') * F_2 + (F_2 - xF_2') * F_1\bigr] = \tfrac{1}{2}\bigl[V_1 * F_2 + V_2 * F_1\bigr].$$

The right side of the last equation is a distribution function; hence by
Theorem 1 $F(x)$ is unimodal with vertex 0.

* See Saks, Theory of the Integral, 2nd ed., Stechert, New York (1937), p. 272.
If we assume, as is sufficient for our purpose, that at least one of $F'_-(x)$ and $F'_+(x)$ is
equal to zero everywhere in $I$ except in a denumerable set of points, then the conclusion
of the theorem can be proved by the following simple argument due to
P. Erdős. If $F(x)$ is not constant we may suppose that there are two points $a$ and $b$
in $I$ such that $a < b$ and $F(a) < F(b)$. Let $F(a) < c < d < F(b)$ and consider the
straight line $L(d)$ passing through $(a, c)$ and $(b, d)$. Let $x_0(d)$ be the supremum of the
points in $(a, b)$ at which the curve $y = F(x)$ is strictly below $L(d)$. Obviously, neither
$F'_-(x_0)$ nor $F'_+(x_0)$ can vanish. Two distinct values of $d$ correspond to two distinct
values of $x_0(d)$. Hence there is a set of points in $(a, b)$ of the power of the continuum
at which neither derivative vanishes.


There are two errors in this proof. First, the equation $F' = F_1' * F_2$ holds
only if $F_1(x)$ is continuous at $x = 0$, hence absolutely continuous on
$(-\infty, +\infty)$. Even in this case the differentiation under the integral needs
justification. This can be done by Fubini's theorem, Lebesgue's convergence
theorem, and the fact that $F_1' * F_2$ is continuous except possibly in
a denumerable set. On the other hand, if Lapin's theorem were true for
continuous $F_1$ and $F_2$, then it would be true also in general. This is easily
seen if we write each $F_i$ as the sum of its jump at $x = 0$ and a continuous
part to which the result applies.
The second error lies in the fact that

$$F_1 * F_2 - x(F_1' * F_2) = (F_1 - xF_1') * F_2$$

is not an identity. This error cannot be repaired to save the theorem.*
In fact, the statement of the theorem itself is false, as the following trivial
example shows. Let

х+ў—%<х<$, 

0 otherwise. 


F(z) -Í 


Then, according to the definition, F is unimodal with vertex О, but FF 
is unimodal with (the unique) vertex 4. 
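To see why the relation used in Lapin's proof is not an identity, one can expand the convolution on its right side. The following short computation is a sketch in which $xF_1'$ denotes the function $u \mapsto uF_1'(u)$; this reading of the notation is an assumption, and all integrals are assumed to exist.

```latex
\bigl((xF_1') * F_2\bigr)(x)
  = \int (x - z)\,F_1'(x - z)\,dF_2(z)
  = x\,(F_1' * F_2)(x) - \int z\,F_1'(x - z)\,dF_2(z),
\qquad\text{so that}\qquad
\bigl((F_1 - xF_1') * F_2\bigr)(x)
  = (F_1 * F_2)(x) - x\,(F_1' * F_2)(x) + \int z\,F_1'(x - z)\,dF_2(z).
```

The two sides of the relation therefore differ by the term $\int z\,F_1'(x - z)\,dF_2(z)$, which vanishes when $F_2$ is concentrated at 0 but not in general.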

The statement of Theorem 3 remains false even if no specification is
made about the vertices. We shall give an example in which $F$ is unimodal
with vertex 0 and absolutely continuous, but $F * F$ is not unimodal at all;
in fact, its derivative (density function) is continuous and has two relative
maxima.

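Example.

$$p(x) = \begin{cases} 0 & \text{if } x < -\frac{1}{15},\\[2pt] 5 & \text{if } -\frac{1}{15} < x < 0,\\[2pt] 1 & \text{if } 0 < x < \frac{2}{3},\\[2pt] 0 & \text{if } \frac{2}{3} < x; \end{cases} \qquad F(x) = \int_{-\infty}^{x} p(z)\,dz.$$

The derivative of $F * F$ is then given by

$$p_2(x) = \int_{-\infty}^{\infty} p(x - z)\,p(z)\,dz.$$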

* For a result which can be obtained by correcting the error, see K. L. Chung,
Sur les distributions unimodales, C. R. Acad. Sci. Paris, 236, 583-584 (1953).
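Elementary computation gives the following explicit formula:

$$p_2(x) = \begin{cases} 0 & \text{if } x < -\frac{2}{15},\\[2pt] 25x + \frac{10}{3} & \text{if } -\frac{2}{15} \le x < -\frac{1}{15},\\[2pt] -15x + \frac{2}{3} & \text{if } -\frac{1}{15} \le x < 0,\\[2pt] x + \frac{2}{3} & \text{if } 0 \le x < \frac{3}{5},\\[2pt] -9x + \frac{20}{3} & \text{if } \frac{3}{5} \le x < \frac{2}{3},\\[2pt] -x + \frac{4}{3} & \text{if } \frac{2}{3} \le x < \frac{4}{3},\\[2pt] 0 & \text{if } \frac{4}{3} \le x. \end{cases}$$

Thus $p_2(x)$ has two relative maxima at $-\frac{1}{15}$ and $\frac{3}{5}$ with the values $\frac{5}{3}$ and $\frac{19}{15}$
respectively, and a minimum at 0 with the value $\frac{2}{3}$.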
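The computation in this example can be checked in exact rational arithmetic. The sketch below assumes that the density of the Example equals 5 on $(-\frac{1}{15}, 0)$ and 1 on $(0, \frac{2}{3})$, the values for which the slopes 25, -15, 1, -9, -1 of the explicit formula come out. The names `PIECES` and `p2` are introduced only for this illustration.

```python
from fractions import Fraction as Fr

# Piecewise-constant density as (left end, right end, height).
# These values are an assumption about the Example; the total mass is 1.
PIECES = [(Fr(-1, 15), Fr(0), Fr(5)), (Fr(0), Fr(2, 3), Fr(1))]

def p2(x):
    """Density of F * F at x, i.e. the integral of p(x - z) p(z) dz.

    For each pair of pieces the integrand is constant on the set of z
    with z in (a2, b2) and x - z in (a1, b1), so only its length matters.
    """
    total = Fr(0)
    for a1, b1, c1 in PIECES:
        for a2, b2, c2 in PIECES:
            lo = max(a2, x - b1)
            hi = min(b2, x - a1)
            if hi > lo:
                total += c1 * c2 * (hi - lo)
    return total

print(p2(Fr(-1, 15)), p2(Fr(0)), p2(Fr(3, 5)))
```

The value at 0 lies below the values at both $-\frac{1}{15}$ and $\frac{3}{5}$, so the convolution has two separate relative maxima although each factor is unimodal with vertex 0.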

We may, if we wish, modify this example in such a way that $p(x)$ is
continuous everywhere and attains its maximum at a unique point, while
$p_2(x)$ still has more than one relative maximum. This follows from considerations
of continuity.

Theorem 5 of § 32, attributed to Gnedenko, states that “all distribution
functions belonging to the class L are unimodal.” The proof given depends
essentially on the false Theorem 3 and therefore is not valid. Thus this 
interesting statement remains a conjecture. It is not even known whether 
all stable laws are unimodal. The only results known in this direction seem 
to be those of Wintner * which state that “the composition of two sym- 
metrical unimodal distribution functions is symmetrical unimodal,” and 
consequently that “all symmetrical stable laws are unimodal.” 

In this translation the original Theorems 3 and 5 are omitted and
Theorem 4 is renumbered Theorem 3. As a consequence, proposition 1
of § 36 which states that “all stable laws are unimodal” is also omitted,
and the subsequent propositions renumbered accordingly. Owing to this
revision the material in § 32 will have no bearing on the rest of the book.
However, we deem Theorems 1 and 2 of § 32 of sufficient independent
interest to be included in the translation.


* A. Wintner, Asymptotic Distributions and Infinite Convolutions. Edwards
Brothers, Ann Arbor, Michigan (1938), pp. 30 and 32.